Commit 907dcef

Remove ordered list indicator

M.Notter committed Jan 18, 2025
1 parent 0c80986 commit 907dcef

Showing 3 changed files with 19 additions and 19 deletions.
16 changes: 8 additions & 8 deletions _posts/2023-10-23-01_scikit_simple.md
@@ -462,7 +462,7 @@ plt.close()

Before wrapping up, let's discuss some important pitfalls to avoid when working on classification tasks:

- 1. **Data Leakage**: Always split your data before any preprocessing or feature engineering
+ **Data Leakage**: Always split your data before any preprocessing or feature engineering

```python
# Wrong: Preprocessing before split
@@ -475,7 +475,7 @@ X_tr_scaled = preprocessing.scale(X_tr)
X_te_scaled = preprocessing.scale(X_te)
```
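
For contrast, a minimal leak-free sketch (the iris data, split settings, and `StandardScaler` are placeholder choices for illustration, not code from the post):

```python
# Leak-free pattern: split first, then fit the scaler on training data only.
from sklearn import preprocessing
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scaler = preprocessing.StandardScaler().fit(X_tr)  # statistics from X_tr only
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)  # test set reuses training statistics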

- 2. **Class Imbalance**: Always check your class distribution
+ **Class Imbalance**: Always check your class distribution

```python
# Using pandas for better visualization
@@ -493,7 +493,7 @@ plt.xlabel('Class')
plt.ylabel('Frequency (%)')
```
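
If the distribution turns out to be skewed, class weights are one low-effort countermeasure. A small sketch, assuming a synthetic 9:1 dataset (not data from the post):

```python
# Compensate for imbalance by reweighting classes during fitting.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
clf = LogisticRegression(class_weight='balanced')  # upweights the minority class
clf.fit(X, y)
```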

- 3. **Overfitting**: Monitor these warning signs
+ **Overfitting**: Monitor these warning signs
- Large gap between training and validation scores
- Perfect training accuracy (like we saw with RandomForest)
- Poor generalization to new data
@@ -507,7 +507,7 @@ print(f"CV Scores: {scores}")
print(f"Mean: {scores.mean():.3f}{scores.std()*2:.3f})")
```
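
A complementary check is the raw gap between training and test scores. A self-contained sketch (data, model, and the 0.1 threshold are illustrative):

```python
# A large train-test score gap is a direct overfitting signal.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
gap = rf.score(X_tr, y_tr) - rf.score(X_te, y_te)
if gap > 0.1:  # rule of thumb, not a hard rule
    print(f"Possible overfitting: train-test gap of {gap:.2f}")
```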

- 4. **Memory Management**: For large datasets, consider these approaches
+ **Memory Management**: For large datasets, consider these approaches

```python
# Use n_jobs parameter for parallel processing
@@ -517,7 +517,7 @@ rf = RandomForestClassifier(n_jobs=-1) # Use all available cores
rf = RandomForestClassifier(max_samples=0.8) # Use 80% of samples per tree
```
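
For data that won't fit in memory at all, estimators with `partial_fit` allow chunk-wise training. A sketch where random chunks stand in for batches read from disk:

```python
# Incremental learning: feed the model one chunk at a time.
import numpy as np
from sklearn.linear_model import SGDClassifier

clf = SGDClassifier()
classes = np.array([0, 1])  # all classes must be declared for partial_fit
for _ in range(10):  # stand-in for reading chunks from disk
    X_chunk = np.random.rand(100, 20)
    y_chunk = np.random.randint(0, 2, 100)
    clf.partial_fit(X_chunk, y_chunk, classes=classes)
```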

- 5. **Feature Scaling**: Different algorithms have different scaling requirements
+ **Feature Scaling**: Different algorithms have different scaling requirements

```python
# SVM requires scaling, Random Forests don't
@@ -532,7 +532,7 @@ X_te_scaled = scaler.transform(X_te)
rf.fit(X_tr, y_tr) # No scaling needed
```
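
A `Pipeline` makes this hard to get wrong, since the scaler is re-fit on each training fold during cross-validation. A minimal sketch (iris and `SVC` are placeholder choices):

```python
# Bundling scaler + estimator keeps scaling inside each CV fold.
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
pipe = make_pipeline(StandardScaler(), SVC())
print(cross_val_score(pipe, X, y, cv=5))
```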

- 6. **Model Selection Bias**: Don't use test set for model selection
+ **Model Selection Bias**: Don't use test set for model selection

```python
# Wrong: Using test set for parameter tuning
@@ -546,7 +546,7 @@ grid.fit(X_tr, y_tr)
# Only use test set for final evaluation
```
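
Putting it together, the full pattern might look like this (grid values and data are placeholders): tune with cross-validation on the training set, then score the test set exactly once:

```python
# Model selection sees only training data; the test set is used once.
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

grid = GridSearchCV(SVC(), {'C': [0.1, 1, 10]}, cv=5)
grid.fit(X_tr, y_tr)              # tuning via CV on the training set
print(grid.best_params_)
print(grid.score(X_te, y_te))     # single, final test evaluation
```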

- 7. **Model Troubleshooting Tips**
+ **Model Troubleshooting Tips**

```python
# Check for data issues first
@@ -564,7 +564,7 @@ if np.any(y_prob > 1.0) or np.any(y_prob < 0.0):
print("Warning: Invalid probability predictions!")
```
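
The data checks hinted at above could look like this (the toy array is a placeholder):

```python
# Basic input sanity checks before training.
import numpy as np

X = np.array([[1.0, np.nan], [np.inf, 2.0]])
print("NaNs:", int(np.isnan(X).sum()))
print("Infs:", int(np.isinf(X).sum()))
print("Value range:", np.nanmin(X), np.nanmax(X))
```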

- 8. **Common Error Messages and Solutions**
+ **Common Error Messages and Solutions**
- `ValueError: Input contains NaN`: Clean your data before training
- `MemoryError`: Reduce batch size or use data generators
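
For the `NaN` case, a hedged sketch of one common fix, mean imputation (not necessarily what the post recommends):

```python
# Impute missing values before fitting to avoid the ValueError.
import numpy as np
from sklearn.impute import SimpleImputer

X = np.array([[1.0, np.nan], [3.0, 4.0]])
X_clean = SimpleImputer(strategy='mean').fit_transform(X)
print(X_clean)
```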

8 changes: 4 additions & 4 deletions _posts/2023-10-23-02_tensorflow_simple.md
@@ -365,7 +365,7 @@ plt.show()
### Common Deep Learning Pitfalls
When starting with TensorFlow and neural networks, watch out for these common issues:

- 1. **Data Preparation**
+ **Data Preparation**
- (Almost) always scale input data (like we did with `/255.0`)
- Check for missing or invalid values
- Ensure consistent data types
@@ -376,7 +376,7 @@ x_train = x_train.astype('float32') / 255.0
x_test = x_test.astype('float32') / 255.0
```
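
A few assertions can catch preparation mistakes early. A sketch where `x_train` is a stand-in with the dtype and range the `/255.0` step should produce:

```python
# Quick validity checks after scaling.
import numpy as np

x_train = np.random.randint(0, 256, (100, 28, 28)).astype('float32') / 255.0
assert not np.isnan(x_train).any(), "NaNs in input"
assert x_train.dtype == np.float32, "unexpected dtype"
assert 0.0 <= x_train.min() and x_train.max() <= 1.0, "values outside [0, 1]"
```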

- 2. **Model Architecture**
+ **Model Architecture**
- Start simple, add complexity only if needed
- Match output layer to your task (softmax for classification)
- Use appropriate layer sizes
@@ -391,7 +391,7 @@ model = keras.Sequential([
])
```

- 3. **Training Issues**
+ **Training Issues**
- Monitor training metrics (loss not decreasing)
- Watch for overfitting (validation loss increasing)
- Use appropriate batch sizes
@@ -406,7 +406,7 @@ history = model.fit(
)
```
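
`EarlyStopping` turns the "validation loss increasing" signal into an automatic stop. A sketch with illustrative `monitor`/`patience` values:

```python
# Stop training when validation loss stops improving.
from tensorflow import keras

early_stop = keras.callbacks.EarlyStopping(
    monitor='val_loss', patience=3, restore_best_weights=True)
# then: model.fit(..., callbacks=[early_stop])
```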

- 4. **Memory Management**
+ **Memory Management**
- Clear unnecessary variables
- Use appropriate data types
- Watch batch sizes on limited hardware
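
A sketch of these ideas (array sizes and batch size are placeholders):

```python
# Common memory-saving moves: smaller dtypes, batched streaming, cleanup.
import numpy as np
import tensorflow as tf

x = np.random.rand(1024, 32).astype('float32')  # float32 halves float64 memory
y = np.random.randint(0, 10, 1024)
ds = tf.data.Dataset.from_tensor_slices((x, y)).batch(64)  # stream in batches
del x, y  # drop the NumPy copies once the dataset holds the data
```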
14 changes: 7 additions & 7 deletions _posts/2023-10-23-04_tensorflow_advanced.md
@@ -677,7 +677,7 @@ plot_history(history_file=history_file, title="Training overview of best model")

When working with complex neural networks and regression tasks, be aware of these advanced challenges:

- 1. **Gradient Issues**
+ **Gradient Issues**
- Vanishing/exploding gradients in deep networks
- Unstable training with certain architectures

@@ -694,7 +694,7 @@ x = layers.BatchNormalization()(x)
x = layers.ReLU()(x)
```
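
Gradient clipping is another common guard against exploding gradients. A one-line sketch (the `clipnorm` value is illustrative):

```python
# Clip gradient norms so single batches can't blow up the weights.
from tensorflow import keras

optimizer = keras.optimizers.Adam(learning_rate=1e-3, clipnorm=1.0)
```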

- 2. **Learning Rate Dynamics**
+ **Learning Rate Dynamics**
- Static learning rates often suboptimal
- Different layers may need different rates

@@ -720,7 +720,7 @@ def warmup_cosine_decay(step):
return tf.where(step < warmup_steps, warmup_rate, cosine_rate)
```
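
To actually drive training with a step-based schedule like this, one option is wrapping it in a `LearningRateSchedule`. A sketch that assumes the `warmup_cosine_decay` helper above; the wrapper class itself is not from the post:

```python
# Wrap a step-based function so the optimizer can call it each step.
import tensorflow as tf

class WarmupCosine(tf.keras.optimizers.schedules.LearningRateSchedule):
    def __call__(self, step):
        return warmup_cosine_decay(step)  # helper defined above
    def get_config(self):
        return {}

optimizer = tf.keras.optimizers.Adam(learning_rate=WarmupCosine())
```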

- 3. **Complex Loss Functions**
+ **Complex Loss Functions**
- Multiple objectives need careful weighting
- Custom losses require gradient consideration
- Handle edge cases and numerical stability
@@ -739,7 +739,7 @@ class WeightedMSE(keras.losses.Loss):
return tf.reduce_mean(weighted_errors, axis=-1)
```
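
Wiring such a loss into a model is then a one-liner in `compile`. A sketch with stand-in data and architecture, assuming `WeightedMSE` can be constructed without arguments (its `__init__` is not shown above):

```python
# Pass the custom loss instance to compile like any built-in loss.
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

model = keras.Sequential([layers.Dense(1, input_shape=(4,))])
model.compile(optimizer='adam', loss=WeightedMSE())  # class defined above
model.fit(np.random.rand(32, 4), np.random.rand(32, 1), epochs=1, verbose=0)
```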

- 4. **Data Pipeline Bottlenecks**
+ **Data Pipeline Bottlenecks**
- I/O can become training bottleneck
- Memory constraints with large datasets

@@ -757,7 +757,7 @@ def data_generator():
yield x_tr[i:i+batch_size], y_tr[i:i+batch_size]
```
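
`tf.data` expresses the same idea with built-in batching and prefetching, where `AUTOTUNE` overlaps data preparation with training. A sketch with stand-in arrays:

```python
# Built-in batching + prefetching instead of a hand-rolled generator.
import numpy as np
import tensorflow as tf

x_tr = np.random.rand(256, 8).astype('float32')  # stand-ins for the real data
y_tr = np.random.rand(256, 1).astype('float32')
ds = (tf.data.Dataset.from_tensor_slices((x_tr, y_tr))
      .batch(32)
      .prefetch(tf.data.AUTOTUNE))
```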

- 5. **Model Architecture Complexity**
+ **Model Architecture Complexity**
- Deeper isn't always better
- Skip connections can help with gradient flow

@@ -774,7 +774,7 @@ def residual_block(x, filters):
return layers.ReLU()(x)
```
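
Such a block would then be stacked in a functional model. A hedged sketch, since the block's full body is not shown above; the input shape and filter count are placeholders that must match what the block expects:

```python
# Use the residual block inside a functional model.
from tensorflow import keras

inputs = keras.Input(shape=(32, 32, 16))
x = residual_block(inputs, filters=16)  # function defined above
model = keras.Model(inputs, x)
```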

- 6. **Regularization Strategy**
+ **Regularization Strategy**
- Different layers may need different regularization
- Combine multiple regularization techniques

@@ -789,7 +789,7 @@ x = layers.BatchNormalization()(x)
x = layers.Dropout(0.5)(x)
```
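
L2 weight decay can be combined with the pattern above as a third technique. A sketch (the `1e-4` factor is illustrative):

```python
# Add L2 weight decay on top of BatchNormalization/Dropout.
from tensorflow import keras
from tensorflow.keras import layers

dense = layers.Dense(64, activation='relu',
                     kernel_regularizer=keras.regularizers.l2(1e-4))
```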

- 7. **Model Debugging**
+ **Model Debugging**
- Add metrics to monitor internal states
- Use callbacks for detailed inspection
- Clear unused variables and models
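
A minimal custom callback for such inspection might look like this (printing weight norms is just one example of what to monitor):

```python
# Inspect internal state at the end of every epoch.
import numpy as np
from tensorflow import keras

class DebugCallback(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        norms = [float(np.linalg.norm(w)) for w in self.model.get_weights()]
        print(f"epoch {epoch}: weight norms {norms}")
```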
