I just started learning Professor Li Hongyi's deep learning course and I'm recording the process of completing the experiments.
I will try to follow the hints from the course assignments and avoid using tricky techniques, aiming to use standard methods.
The metrics to be achieved are:
- private
- public
Simple
Just run the sample code and submit the results.
Medium
Feature selection
Looking at the training data, you can see that the 0th column is just the sample id, which carries no useful information, so it simply needs to be filtered out.
# Keep every column except column 0 (the id)
feat_idx = list(range(raw_x_train.shape[1]))
feat_idx = feat_idx[1:]
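As a minimal sketch of how the index list is then applied (the array names follow the HW1 sample code and are my assumption, not necessarily the exact script):

x_train = raw_x_train[:, feat_idx]   # drop the id column from the training features
x_valid = raw_x_valid[:, feat_idx]
x_test = raw_x_test[:, feat_idx]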
Result
Surprisingly, this already passes even the strong baseline.
Strong
Since the Medium approach already cleared the strong baseline, there is no need for special adjustments to the model architecture or optimizer; I will keep using the Medium setup.
Boss
I don't want to use too many tricky techniques, so I only made the following adjustments:
- Batch size reduced from 256 to 128 (a suitably small batch size adds a little noise to the gradient estimates, somewhat like annealing, which gives the optimizer a chance to settle into a solution that generalizes better).
- The model architecture was changed to use LeakyReLU activations and dropout (a sketch of the full model class follows after this list). For a walkthrough of common activation functions, see: Understanding Activation Functions (Sigmoid/ReLU/LeakyReLU/PReLU/ELU) - Zhihu

self.layers = nn.Sequential(
    nn.Linear(input_dim, 32),
    nn.LeakyReLU(),
    nn.Linear(32, 64),
    nn.LeakyReLU(),
    nn.Dropout(0.1),   # randomly zero 10% of activations to curb overfitting
    nn.Linear(64, 1)
)
- The optimizer is SGD with momentum and weight decay:

optimizer = torch.optim.SGD(model.parameters(),
                            lr=config['learning_rate'],
                            momentum=0.9,
                            weight_decay=config['weight_decay'])
- Feature selection module, based on sklearn's SelectKBest (how the selected indices feed back into training is sketched after this list):

import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest
from sklearn.feature_selection import f_regression

features = pd.read_csv('./covid.train.csv')
x_data, y_data = features.iloc[:, 0:117], features.iloc[:, 117]

# Try to choose your k best features
k = 24
selector = SelectKBest(score_func=f_regression, k=k)
result = selector.fit(x_data, y_data)

# result.scores_ holds a score for each feature.
# np.argsort sorts indices by ascending score; reverse it to get descending order.
idx = np.argsort(result.scores_)[::-1]
print(f'Top {k} Best feature score ')
print(result.scores_[idx[:k]])

print(f'\nTop {k} Best feature index ')
print(idx[:k])

print(f'\nTop {k} Best feature name')
print(x_data.columns[idx[:k]])

selected_idx = list(np.sort(idx[:k]))
print(selected_idx)
print(x_data.columns[selected_idx])

With SelectKBest, the scores of the top 24 features are much higher than those of the remaining features, so I set k = 24.
- Data preprocessing: min-max normalization, refer to: How to Understand Normalization? - Zhihu

# Normalization: scale each feature to [0, 1] using the training-set min and max
x_min, x_max = x_train.min(axis=0), x_train.max(axis=0)
x_train = (x_train - x_min) / (x_max - x_min)
x_valid = (x_valid - x_min) / (x_max - x_min)
x_test = (x_test - x_min) / (x_max - x_min)
- Parameter settings (the early-stopping behaviour that 'early_stop' controls is sketched after this list):

config = {
    'seed': 5201314,        # Your seed number, you can pick your lucky number. :)
    'select_all': False,    # Whether to use all features.
    'valid_ratio': 0.2,     # validation_size = train_size * valid_ratio
    'n_epochs': 3000,       # Number of epochs.
    'batch_size': 128,
    'learning_rate': 1e-4,
    'weight_decay': 1e-4,
    'early_stop': 600,      # Stop training if the model has not improved for this many consecutive epochs.
    'save_path': './models/model.ckpt'  # Your model will be saved here.
}
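For completeness, here is a minimal sketch of how the new layer stack sits inside a full model class. The class name, the forward() structure, and the squeeze(1) call follow the HW1 sample code rather than my exact script, so treat them as assumptions:

import torch.nn as nn

class My_Model(nn.Module):
    def __init__(self, input_dim):
        super().__init__()
        # Same stack as in the bullet above: two hidden layers with LeakyReLU, plus dropout
        self.layers = nn.Sequential(
            nn.Linear(input_dim, 32),
            nn.LeakyReLU(),
            nn.Linear(32, 64),
            nn.LeakyReLU(),
            nn.Dropout(0.1),
            nn.Linear(64, 1)
        )

    def forward(self, x):
        x = self.layers(x)    # shape (B, 1)
        return x.squeeze(1)   # shape (B,), matching the regression targets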
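To connect the SelectKBest result to training, here is a sketch of an HW1-style select_feat helper; the signature mirrors the sample code, and selected_idx is assumed to be the list computed by the feature-selection snippet above:

def select_feat(train_data, valid_data, test_data, select_all=True):
    # Last column is the target; everything before it is a feature.
    y_train, y_valid = train_data[:, -1], valid_data[:, -1]
    raw_x_train, raw_x_valid, raw_x_test = train_data[:, :-1], valid_data[:, :-1], test_data

    if select_all:
        feat_idx = list(range(raw_x_train.shape[1]))
    else:
        feat_idx = selected_idx  # the k = 24 columns picked by SelectKBest

    return raw_x_train[:, feat_idx], raw_x_valid[:, feat_idx], raw_x_test[:, feat_idx], y_train, y_valid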
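Finally, a minimal sketch of the early-stopping loop that 'n_epochs', 'early_stop', and 'save_path' control. The structure follows the HW1 sample trainer; train_one_epoch and evaluate are hypothetical helpers used only to keep the sketch short:

import math
import torch

best_loss, early_stop_count = math.inf, 0
for epoch in range(config['n_epochs']):
    train_one_epoch(model, train_loader, optimizer)   # hypothetical helper: one pass over the training set
    valid_loss = evaluate(model, valid_loader)        # hypothetical helper: mean MSE on the validation set

    if valid_loss < best_loss:
        best_loss = valid_loss
        torch.save(model.state_dict(), config['save_path'])  # keep the best checkpoint
        early_stop_count = 0
    else:
        early_stop_count += 1

    # Stop once validation loss has not improved for 'early_stop' consecutive epochs
    if early_stop_count >= config['early_stop']:
        print('Model is not improving, stop training.')
        break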
In the end, the score on the private set still falls a bit short, but I've already spent too much time on this, so I'll leave it here for now. I hope I've picked up some standard methods along the way.
To achieve the boss baseline, you can refer to Machine Learning Artisan - CSDN Blog