Convert a bipartite dataset to a simpler global-single output format.
Convert a bipartite interaction problem, in which X holds two feature
matrices (one for each axis) and y is an interaction matrix, to the usual
single-table format, where each sample is a combination of a sample from
X[0] and a sample from X[1].
If X is a list of Xi feature matrices, one for each bipartite group,
convert it to traditional data format by generating concatenations of
rows from X[0] with rows from X[1].
Pick one row from each of the 2D arrays in X, in their presented order, and
concatenate them. Repeat for every combination. Return a 2D array whose
rows are all the possible combinations of rows in X.
Parameters:
X (list-like of 2D np.ndarrays) –
Returns:
result – Cartesian product of X’s 2d arrays, row-wise.
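The row-wise Cartesian product described above can be sketched with plain NumPy. This is a minimal illustration for the two-matrix case, not the library's implementation: the helper name row_cartesian_product is hypothetical, and the row ordering (all rows of X[1] cycled for each row of X[0]) is an assumption chosen so that it lines up with a row-major flattening of y.

```python
import numpy as np


def row_cartesian_product(X):
    """Hypothetical helper: concatenate every row of X[0] with every row of X[1]."""
    X0, X1 = np.asarray(X[0]), np.asarray(X[1])
    n0, n1 = X0.shape[0], X1.shape[0]
    # Repeat each row of X0 n1 times and tile X1 n0 times, so that
    # row i * n1 + j of the result is the concatenation [X0[i], X1[j]].
    left = np.repeat(X0, n1, axis=0)
    right = np.tile(X1, (n0, 1))
    return np.hstack([left, right])


X = [np.array([[1, 2], [3, 4]]), np.array([[10], [20], [30]])]
result = row_cartesian_product(X)
print(result.shape)  # (6, 3): 2 * 3 combinations, 2 + 1 features
```

The result has X[0].shape[0] * X[1].shape[0] rows, so memory use grows with the product of the two sample counts.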
Fit the k-nearest neighbors regressor from the training dataset.
:param X: Training data.
:type X: {array-like, sparse matrix} of shape (n_samples, n_features) or (n_samples, n_samples) if metric='precomputed'
:param y: Target values.
:type y: {array-like, sparse matrix} of shape (n_samples,) or (n_samples, n_outputs)
Predict the target for the provided data.
:param X: Test samples.
:type X: {array-like, sparse matrix} of shape (n_queries, n_features), or (n_queries, n_indexed) if metric == 'precomputed'
Returns:
y – Target values.
Return type:
ndarray of shape (n_queries,) or (n_queries, n_outputs), dtype=int
Employ the GSO strategy to adapt standard estimators to bipartite data.
In this strategy, the estimator is applied to concatenations of a feature
vector from the first sample domain with a feature vector from the second
domain, while y is considered a unidimensional vector.
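The following sketch shows the idea behind the strategy using only NumPy and scikit-learn, without the wrapper itself. The random data, the choice of KNeighborsRegressor, and the pair ordering are assumptions made for illustration; the wrapper may organize the data differently internally.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X = [rng.random((5, 3)), rng.random((4, 2))]  # one feature matrix per axis
y = rng.random((5, 4))                        # interaction matrix

n_rows, n_cols = y.shape
# One sample per (row, column) pair: features of X[0][i] followed by X[1][j].
X_flat = np.hstack([
    np.repeat(X[0], n_cols, axis=0),
    np.tile(X[1], (n_rows, 1)),
])
# Row-major flattening keeps y_flat[i * n_cols + j] == y[i, j],
# matching the ordering of X_flat above.
y_flat = y.reshape(-1)

# Any standard single-output estimator can now be trained as usual.
estimator = KNeighborsRegressor(n_neighbors=3).fit(X_flat, y_flat)
pred = estimator.predict(X_flat).reshape(n_rows, n_cols)
```

Reshaping the flat predictions back to (n_rows, n_cols) recovers an interaction matrix with the same layout as y.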
Implements the Local Multi-Output strategy for adapting estimators.
This wrapper facilitates the implementation of the local multi-output
approach to adapt monopartite estimators to bipartite scenarios. In this
approach, four multi-output estimators are aggregated.
The training procedure (calling fit(X_train, y_train)) consists simply of:
Train a primary rows estimator on X_train[0] and y_train.
Train a primary columns estimator on X_train[1] and y_train.T.
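The two training steps above can be sketched directly with scikit-learn. This is an illustration under stated assumptions rather than the wrapper's code: the data are random, and KNeighborsRegressor is picked only because it supports multiple outputs natively.

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X_train = [rng.random((6, 3)), rng.random((4, 2))]  # row and column features
y_train = rng.random((6, 4))                        # interaction matrix

# Rows estimator: each row sample has one output per training column.
primary_rows_estimator = KNeighborsRegressor(n_neighbors=2)
primary_rows_estimator.fit(X_train[0], y_train)

# Columns estimator: each column sample has one output per training row,
# hence the transposed interaction matrix.
primary_cols_estimator = KNeighborsRegressor(n_neighbors=2)
primary_cols_estimator.fit(X_train[1], y_train.T)
```

Each primary estimator is therefore a multi-output model whose number of outputs equals the number of training samples on the opposite axis.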
The prediction procedure then uses the predictions of the primary
estimators to make predictions on completely new
interactions. predict(X_test) will perform the following steps:
Use self.primary_cols_estimator_ to predict new columns for
the interaction matrix, that correspond to the targets of
X_test[0].
Use self.primary_rows_estimator_ to predict new rows for the
interaction matrix, that correspond to the targets of X_test[1].
Fit the secondary rows estimator on the newly predicted columns
and X_test[0].
Fit the secondary columns estimator on the newly predicted rows
and X_test[1].
Combine the predictions of the secondary estimators using
If self.independent_labels is False, then the original
training data is appended to the training data of the secondary
estimators in step 2, allowing the secondary estimators to explore
inter-output correlations.
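The prediction steps can be sketched as follows. This is one dimensionally consistent reading of the procedure, not the wrapper's actual code, and it rests on several assumptions: the secondary estimators are fit on the training features with the primary predictions as targets and are then queried with the test features, the original training targets are not appended, and the two secondary predictions are combined with a plain mean (the text does not say which combination function is used).

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(0)
X_train = [rng.random((6, 3)), rng.random((5, 2))]
y_train = rng.random((6, 5))
X_test = [rng.random((2, 3)), rng.random((3, 2))]

# Primary estimators, trained once at fit time.
primary_rows = KNeighborsRegressor(n_neighbors=2).fit(X_train[0], y_train)
primary_cols = KNeighborsRegressor(n_neighbors=2).fit(X_train[1], y_train.T)

# New columns of the interaction matrix: (n_train_rows, n_test_cols).
new_cols = primary_cols.predict(X_test[1]).T
# New rows of the interaction matrix: (n_test_rows, n_train_cols).
new_rows = primary_rows.predict(X_test[0])

# Secondary estimators are refit on every prediction call (assumed setup).
secondary_rows = KNeighborsRegressor(n_neighbors=2).fit(X_train[0], new_cols)
secondary_cols = KNeighborsRegressor(n_neighbors=2).fit(X_train[1], new_rows.T)

rows_pred = secondary_rows.predict(X_test[0])    # (n_test_rows, n_test_cols)
cols_pred = secondary_cols.predict(X_test[1]).T  # (n_test_rows, n_test_cols)

# Assumed combination: simple average of the two views.
y_pred = (rows_pred + cols_pred) / 2
```

Both secondary predictions cover the same block of unseen row and column pairs, which is why they can be combined element-wise.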
See the User Guide for a diagram and
more information.
Note that the secondary estimators must be refit every time the
wrapper’s predict() method is called, which may increase prediction
time depending on the type of secondary estimators chosen by the user.
Compositions of single-output estimators can also be used
instead of multi-output estimators, which can be implemented with
scikit-learn wrappers such as MultiOutputRegressor or
MultiOutputClassifier. This could be an interesting option in
cases where the base estimators do not natively support multiple
outputs.
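As a small illustration of such a composition, the sketch below wraps scikit-learn's SVR, which handles a single output, in MultiOutputRegressor. The random data and variable names are assumptions for the example only.

```python
import numpy as np
from sklearn.multioutput import MultiOutputRegressor
from sklearn.svm import SVR

rng = np.random.default_rng(0)
X_rows = rng.random((8, 3))  # features of the samples on one axis
y = rng.random((8, 4))       # four output columns

# MultiOutputRegressor clones and fits one SVR per output column.
multi_svr = MultiOutputRegressor(SVR()).fit(X_rows, y)
pred = multi_svr.predict(X_rows[:2])  # one prediction per output column
```

The fitted per-output models are available in multi_svr.estimators_, and an object built this way could be passed wherever a multi-output estimator is expected.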
A wrapper that fits a single-output estimator to bipartite datasets.
MultiOutputRegressor
A scikit-learn wrapper that fits a separate regressor for each output variable.
MultiOutputClassifier
A scikit-learn wrapper that fits a separate classifier for each output variable.
Examples
from bipartite_learn.datasets import NuclearReceptorsLoader
from bipartite_learn.wrappers import LocalMultiOutputWrapper
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.multioutput import MultiOutputClassifier

X, y = NuclearReceptorsLoader().load()  # X is a list of two matrices
bipartite_clf = LocalMultiOutputWrapper(
    primary_rows_estimator=MultiOutputClassifier(SVC()),
    primary_cols_estimator=MultiOutputClassifier(SVC()),
    secondary_rows_estimator=KNeighborsClassifier(),
    secondary_cols_estimator=KNeighborsClassifier(),
)
bipartite_clf.fit(X, y)
IncompatibleEstimatorsError – If any of the estimators passed as arguments does not support
multi-output functionality. If the secondary estimators are not of
the same type (e.g., regressor, classifier). If only one of the
primary estimators is pairwise.