import pandas as pd
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
df=pd.read_csv('https://raw.githubusercontent.com/dataprofessor/data/master/delaney_solubility_with_descriptors.csv')
X = df.drop('logS', axis=1)
y = df['logS']
X_train, y_train, X_test, y_test = train_test_split(X, y, test_size=0.2, random_state=1)
lr = LinearRegression()
lr.fit(X_train, y_train)
y_lr_train_pred = lr.predict(X_train)
y_lr_test_pred = lr.predict(X_test)
lr_train_mse = mean_squared_error(y_train, y_lr_train_pred)
lr_train_r2 = r2_score(y_train, y_lr_train_pred)
lr_test_mse = mean_squared_error(y_test, y_lr_test_pred)
lr_test_r2 = r2_score(y_test, y_lr_test_pred)
print(lr_train_mse)
lr_results = pd.DataFrame(['Linear regression',lr_train_mse, lr_train_r2, lr_test_mse, lr_test_r2]).transpose()
lr_results.columns = ['Method','Training MSE','Training R2','Test MSE','Test R2']
I have some code that tries to predict logS values. The code is not mine; it comes from a guide, where it runs without errors. So where is the problem?
Full error message:
Traceback (most recent call last):
File "D:\Python\AI\test\main.py", line 14, in <module>
lr.fit(X_train, y_train)
File "D:\Python\AI\test\venv\lib\site-packages\sklearn\base.py", line 1152, in wrapper
return fit_method(estimator, *args, **kwargs)
File "D:\Python\AI\test\venv\lib\site-packages\sklearn\linear_model\_base.py", line 678, in fit
X, y = self._validate_data(
File "D:\Python\AI\test\venv\lib\site-packages\sklearn\base.py", line 622, in _validate_data
X, y = check_X_y(X, y, **check_params)
File "D:\Python\AI\test\venv\lib\site-packages\sklearn\utils\validation.py", line 1164, in check_X_y
check_consistent_length(X, y)
File "D:\Python\AI\test\venv\lib\site-packages\sklearn\utils\validation.py", line 407, in check_consistent_length
raise ValueError(
ValueError: Found input variables with inconsistent numbers of samples: [915, 229]
When I changed the data file, the numbers in the ValueError changed too.
When I changed the test_size value, the numbers in the ValueError changed as well.
For test_size=0.4:
ValueError: Found input variables with inconsistent numbers of samples: [686, 458]
:/
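One observation that may help: the two numbers in each error sum to 1144 (915 + 229 and 686 + 458), and they follow the test_size ratio. That pattern would be consistent with lr.fit receiving the training features together with the test features rather than the training targets. train_test_split returns its outputs grouped per input array, so for train_test_split(X, y) the order is X_train, X_test, y_train, y_test. The sketch below uses synthetic arrays (an assumption, standing in for the CSV) with 1144 rows to show the shapes that come back:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the dataset: 1144 samples, 4 feature columns.
# The row count is inferred from the error message; the column count is arbitrary.
X = np.arange(1144 * 4).reshape(1144, 4)
y = np.arange(1144)

# Outputs are grouped per input: both X pieces first, then both y pieces.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=1
)

print(X_train.shape, X_test.shape)  # feature matrices for train and test
print(y_train.shape, y_test.shape)  # target vectors for train and test
```

With this unpacking order, X_train and y_train have the same number of rows, which is what lr.fit checks for.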