30-Day TensorFlow & PyTorch Learning Plan
PyTorch and TensorFlow are the two most widely used deep learning frameworks, essential for building and deploying AI models.
PyTorch is known for its dynamic computation graph, making it highly flexible for research and experimentation. It provides an intuitive, Pythonic interface, making it popular in academia and among AI researchers. PyTorch is ideal for tasks like computer vision (via TorchVision), NLP (via Hugging Face transformers), and reinforcement learning.
TensorFlow, on the other hand, is geared toward large-scale deployment and production environments. TensorFlow 2.x runs eagerly by default, and tf.function can compile Python code into a static computation graph, which can improve performance and portability. With TensorFlow Extended (TFX) and TensorFlow Lite, it supports end-to-end ML workflows, including mobile and edge deployment.
Both frameworks are critical for advancing AI applications, from research to real-world implementation, enabling innovations in healthcare, finance, and autonomous systems.
Week 1: Foundations (Tensors, Autograd, and Basic Models)
Day 1: Understanding Tensors
Concept: Learn what tensors are and how they work.
Tip: Practice tensor operations like addition, multiplication, and reshaping.
Example Code:
TensorFlow:
import tensorflow as tf
tensor = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
print(tf.reduce_sum(tensor))
PyTorch:
import torch
tensor = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
print(torch.sum(tensor))
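The tip above mentions addition, multiplication, and reshaping. A minimal PyTorch sketch of those operations could look like this (the variable names are illustrative, not part of the plan):

```python
import torch

a = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
b = torch.tensor([[10.0, 20.0], [30.0, 40.0]])

added = a + b          # element-wise addition
scaled = a * b         # element-wise multiplication
product = a @ b        # matrix multiplication
flat = a.reshape(4)    # same values as a 1-D tensor of 4 elements

print(added)
print(scaled)
print(product)
print(flat.shape)
```

Note that `*` is element-wise while `@` is a matrix product; mixing them up is a common early mistake.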
Day 2: Automatic Differentiation (Autograd & GradientTape)
Concept: Compute gradients automatically.
Tip: Use GradientTape (TF) and Autograd (PyTorch) for differentiation.
Example: Compute the derivative of y = x^2 at x = 3.
TensorFlow:
x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x**2
grad = tape.gradient(y, x)
print(grad) # 6.0
PyTorch:
x = torch.tensor(3.0, requires_grad=True)
y = x**2
y.backward()
print(x.grad) # 6.0
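The same mechanism extends to functions of several variables. As a small sketch (the function z below is chosen only for illustration), PyTorch fills in one gradient per input:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = torch.tensor(3.0, requires_grad=True)

z = x**2 * y + y**3   # z = x^2 * y + y^3
z.backward()          # populates x.grad and y.grad

print(x.grad)  # dz/dx = 2*x*y
print(y.grad)  # dz/dy = x^2 + 3*y^2
```

At x = 2 and y = 3 the analytic values are 12 and 31, which is a quick way to check the result by hand.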
Day 3: Building a Simple Neural Network
Concept: Create a feedforward neural network.
Tip: Use Sequential API (TF Keras) and nn.Module (PyTorch).
TensorFlow:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
model = Sequential([
    Dense(10, activation='relu', input_shape=(4,)),
    Dense(1, activation='sigmoid')
])
model.summary()
PyTorch:
import torch.nn as nn
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(4, 10)
        self.fc2 = nn.Linear(10, 1)
    def forward(self, x):
        return torch.sigmoid(self.fc2(torch.relu(self.fc1(x))))
model = SimpleNN()
print(model)
Day 4: Activation Functions (ReLU, Sigmoid, Softmax)
Concept: Understand activation functions.
Tip: ReLU is preferred for hidden layers, Softmax for classification.
TensorFlow:
import tensorflow as tf
print(tf.nn.relu([-1.0, 2.0])) # [0.0, 2.0]
PyTorch:
import torch.nn.functional as F
print(F.relu(torch.tensor([-1.0, 2.0]))) # [0.0, 2.0]
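The tip also mentions Softmax and Sigmoid. A short PyTorch sketch, with made-up logits, shows that softmax turns scores into probabilities that sum to 1:

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([1.0, 2.0, 3.0])    # hypothetical raw scores for 3 classes
probs = F.softmax(logits, dim=0)          # exponentiate, then normalize
sig = torch.sigmoid(torch.tensor(0.0))    # sigmoid squashes a value into (0, 1)

print(probs, probs.sum())
print(sig)
```

The largest logit always receives the largest probability, so the predicted class is unchanged by softmax.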
Day 5: Loss Functions & Optimizers
Concept: Learn loss functions like MSE, CrossEntropy and optimizers like Adam, SGD.
Tip: Use Adam optimizer for adaptive learning.
TensorFlow:
from tensorflow.keras.optimizers import Adam
optimizer = Adam(learning_rate=0.001)
PyTorch:
import torch.optim as optim
optimizer = optim.Adam(model.parameters(), lr=0.001)
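The optimizer needs a loss to minimize. As a rough PyTorch sketch with made-up numbers, MSE suits regression targets while cross-entropy takes raw logits and integer class labels:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Mean squared error: the mean of the squared differences
pred = torch.tensor([1.0, 2.0])
target = torch.tensor([1.0, 4.0])
mse = nn.MSELoss()(pred, target)

# Cross-entropy: expects unnormalized logits and the index of the true class
logits = torch.tensor([[2.0, 0.5, 0.1]])
label = torch.tensor([0])
ce = nn.CrossEntropyLoss()(logits, label)

# Equivalent manual computation: negative log-softmax at the true class
manual_ce = -F.log_softmax(logits, dim=1)[0, 0]

print(mse.item(), ce.item(), manual_ce.item())
```

Because nn.CrossEntropyLoss applies log-softmax internally, the model's last layer should output logits rather than softmax probabilities when this loss is used.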
Day 6: Training a Model (Forward & Backward Pass)
Concept: Implement a full training step.
Tip: Use model.fit() (TF) and a custom loop (PyTorch).
TensorFlow:
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(X_train, y_train, epochs=10, batch_size=32)
PyTorch:
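# train_loader is assumed to be a torch.utils.data.DataLoader yielding (inputs, labels) batches
criterion = torch.nn.BCELoss()  # binary cross-entropy; matches the sigmoid output of the Day 3 model
for images, labels in train_loader:
    optimizer.zero_grad()              # clear gradients from the previous step
    outputs = model(images)            # forward pass
    loss = criterion(outputs, labels)
    loss.backward()                    # backward pass
    optimizer.step()                   # update the parameters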
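To try the loop without a dataset, here is a self-contained sketch that fits a single linear layer to synthetic data. The data, learning rate, and epoch count are assumptions made for illustration only:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Synthetic regression data: y = 3x + 1 plus a little noise
X = torch.linspace(-1, 1, 64).unsqueeze(1)
y = 3 * X + 1 + 0.05 * torch.randn_like(X)

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

losses = []
for epoch in range(200):
    optimizer.zero_grad()          # clear gradients from the previous step
    loss = criterion(model(X), y)  # forward pass
    loss.backward()                # backward pass
    optimizer.step()               # parameter update
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
print(model.weight.item(), model.bias.item())
```

The learned weight and bias should end up close to 3 and 1, and the loss should fall over the epochs.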
Day 7: Evaluating & Saving Models
Concept: Save and reload trained models.
TensorFlow:
model.save("model.h5")
model = tf.keras.models.load_model("model.h5")
PyTorch:
torch.save(model.state_dict(), "model.pth")
model.load_state_dict(torch.load("model.pth"))
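For the evaluation half of this day, a minimal PyTorch sketch of computing accuracy is shown here. The logits and labels are hypothetical stand-ins for a model's outputs on a test batch:

```python
import torch

# Hypothetical logits for 4 samples and 3 classes, plus their true labels
logits = torch.tensor([[2.0, 0.1, 0.3],
                       [0.2, 1.5, 0.1],
                       [0.3, 0.2, 2.2],
                       [1.0, 0.9, 0.8]])
labels = torch.tensor([0, 1, 2, 1])

with torch.no_grad():                    # gradients are not needed for evaluation
    predictions = logits.argmax(dim=1)   # most likely class per sample
    accuracy = (predictions == labels).float().mean().item()

print(f"accuracy: {accuracy:.2f}")
```

With a real model, it is common to call model.eval() first so that layers such as dropout and batch normalization switch to inference behavior.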
Week 2: Deep Learning with CNNs and RNNs
Day 8-10: Convolutional Neural Networks (CNNs)
- Understanding convolutional layers, pooling, and feature extraction.
- Implementing CNNs from scratch in TensorFlow and PyTorch.
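As a starting point for these days, here is a small PyTorch CNN sketch for 28x28 grayscale images. The layer sizes and the class name SmallCNN are illustrative assumptions, not a recommended architecture:

```python
import torch
import torch.nn as nn

class SmallCNN(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 8, kernel_size=3, padding=1),   # 1x28x28 -> 8x28x28
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 8x28x28 -> 8x14x14
            nn.Conv2d(8, 16, kernel_size=3, padding=1),  # 8x14x14 -> 16x14x14
            nn.ReLU(),
            nn.MaxPool2d(2),                             # 16x14x14 -> 16x7x7
        )
        self.classifier = nn.Linear(16 * 7 * 7, num_classes)

    def forward(self, x):
        x = self.features(x)                             # feature extraction
        return self.classifier(torch.flatten(x, start_dim=1))

model = SmallCNN()
batch = torch.randn(4, 1, 28, 28)   # a random batch of 4 images
out = model(batch)
print(out.shape)
```

Tracking the shape after each layer, as in the comments, is a useful habit when sizing the final linear layer.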
Day 11-13: Transfer Learning with Pretrained Models
- Using models like ResNet, VGG, and MobileNet.
- Fine-tuning and feature extraction for custom datasets.
Day 14: Data Augmentation Techniques
- Applying transformations like rotation, flipping, and cropping.
- Using ImageDataGenerator in TensorFlow and torchvision.transforms in PyTorch.
Example Code:
TensorFlow:
from tensorflow.keras.preprocessing.image import ImageDataGenerator
datagen = ImageDataGenerator(rotation_range=40, horizontal_flip=True)
PyTorch:
import torchvision.transforms as transforms
transform = transforms.Compose([
    transforms.RandomRotation(40),
    transforms.RandomHorizontalFlip()
])
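To see what these transformations do to the pixel grid, here is a framework-free NumPy sketch on a tiny 4x4 array standing in for an image (deterministic versions of flip, rotation, and crop):

```python
import numpy as np

image = np.arange(16).reshape(4, 4)   # stand-in for a 4x4 grayscale image

flipped = np.fliplr(image)            # horizontal flip (mirror left to right)
rotated = np.rot90(image)             # rotate 90 degrees counter-clockwise
cropped = image[1:3, 1:3]             # central 2x2 crop

print(flipped)
print(rotated)
print(cropped)
```

The library transforms apply the same kind of operation with random parameters, so each epoch sees a slightly different version of every image.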
Week 3: NLP & Sequence Models
In Week 3, we’ll explore key concepts and models used in Natural Language Processing (NLP) and sequence modeling, including RNNs, LSTMs, Transformers, and BERT. You'll also gain hands-on experience with practical applications like sentiment analysis and text classification.
Day 15-16: RNNs & LSTMs
RNNs: Recurrent Neural Networks
RNNs are designed to handle sequential data, making them perfect for tasks like language modeling, speech recognition, and time-series forecasting. However, they suffer from the vanishing gradient problem when learning long-term dependencies.
Key Concepts:
- RNNs maintain hidden states that are passed across time steps.
- RNNs can struggle with long sequences because gradients can vanish or explode.
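The hidden-state idea can be sketched from scratch in NumPy. The sizes and random weights below are arbitrary assumptions; the point is only the recurrence h_t = tanh(W_x x_t + W_h h_{t-1} + b):

```python
import numpy as np

rng = np.random.default_rng(0)
input_size, hidden_size, seq_len = 3, 4, 5

W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input-to-hidden weights
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden weights
b = np.zeros(hidden_size)

sequence = rng.normal(size=(seq_len, input_size))
h = np.zeros(hidden_size)          # initial hidden state
states = []
for x_t in sequence:
    # The new hidden state depends on the current input and the previous state
    h = np.tanh(W_x @ x_t + W_h @ h + b)
    states.append(h)

states = np.stack(states)
print(states.shape)
```

Because the same W_h is applied at every step, gradients flowing back through many steps are repeatedly multiplied by it, which is where vanishing or exploding gradients come from.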
Sample Code (TensorFlow for RNN):
import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense
from tensorflow.keras.optimizers import Adam
# Prepare sequential data (example: simple sequence prediction)
# RNN layers expect inputs of shape (samples, time steps, features) = (4, 1, 1)
X_train = np.array([[[1.0]], [[2.0]], [[3.0]], [[4.0]]])
y_train = np.array([[2.0], [3.0], [4.0], [5.0]])
# Define the RNN model
model = Sequential([
    SimpleRNN(10, input_shape=(1, 1), activation='relu'),
    Dense(1)
])
model.compile(optimizer=Adam(), loss='mse')
model.fit(X_train, y_train, epochs=100)
# Make a prediction
prediction = model.predict(np.array([[[5.0]]]))
print(prediction)
LSTMs: Long Short-Term Memory
LSTM is a special type of RNN designed to mitigate the vanishing gradient problem by using memory cells. These cells have gates that control the flow of information, allowing LSTMs to capture long-term dependencies.
Key Concepts:
- LSTM cells include input, forget, and output gates.
- They are capable of learning long-range dependencies in sequential data.
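A single LSTM time step can be written out in NumPy to make the three gates concrete. This is a sketch of the standard formulation with random weights; the function name lstm_step and the stacked parameter layout are assumptions for illustration:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W, U, b stack the parameters of the four internal transforms."""
    z = W @ x_t + U @ h_prev + b
    i, f, o, g = np.split(z, 4)
    i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)   # input, forget, output gates
    g = np.tanh(g)                                 # candidate cell values
    c_t = f * c_prev + i * g                       # forget old memory, write new memory
    h_t = o * np.tanh(c_t)                         # expose part of the cell state
    return h_t, c_t

rng = np.random.default_rng(0)
input_size, hidden_size = 3, 4
W = rng.normal(scale=0.1, size=(4 * hidden_size, input_size))
U = rng.normal(scale=0.1, size=(4 * hidden_size, hidden_size))
b = np.zeros(4 * hidden_size)

h, c = np.zeros(hidden_size), np.zeros(hidden_size)
for x_t in rng.normal(size=(5, input_size)):
    h, c = lstm_step(x_t, h, c, W, U, b)
print(h.shape, c.shape)
```

The additive update of the cell state c_t is what lets information, and gradients, survive across many time steps.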
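Sample Code (TensorFlow for LSTM):
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.optimizers import Adam
# Same toy data as the RNN example, shaped (samples, time steps, features)
X_train = np.array([[[1.0]], [[2.0]], [[3.0]], [[4.0]]])
y_train = np.array([[2.0], [3.0], [4.0], [5.0]])
# LSTM model for a time-series forecasting task
model = Sequential([
    LSTM(50, input_shape=(1, 1), activation='relu'),
    Dense(1)
])
model.compile(optimizer=Adam(), loss='mse')
model.fit(X_train, y_train, epochs=100)
# Make a prediction
prediction = model.predict(np.array([[[5.0]]]))
print(prediction)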
Day 17-18: Transformers & BERT
Transformers
The Transformer model, introduced in "Attention Is All You Need," revolutionized NLP by using self-attention mechanisms. Unlike RNNs, Transformers don't process data sequentially, allowing for parallelization and improved efficiency in capturing long-range dependencies.
Key Concepts:
- Self-attention allows each word in a sequence to attend to every other word.
- Multi-head attention enables the model to capture different relationships simultaneously.
- Positional encoding is used to provide information about the order of words.
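Self-attention itself is only a few matrix operations. The following NumPy sketch implements single-head scaled dot-product attention with random projection matrices; the sizes and names are illustrative assumptions:

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, W_q, W_k, W_v):
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])   # similarity of every token pair
    weights = softmax(scores, axis=-1)        # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
seq_len, d_model, d_k = 4, 8, 8
X = rng.normal(size=(seq_len, d_model))       # 4 token embeddings
W_q, W_k, W_v = (rng.normal(size=(d_model, d_k)) for _ in range(3))

output, weights = self_attention(X, W_q, W_k, W_v)
print(output.shape, weights.shape)
```

Row i of weights shows how much token i attends to every token in the sequence. Multi-head attention runs several such projections in parallel and concatenates the results.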
Sample Code (TensorFlow for Transformer):
import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, MultiHeadAttention, LayerNormalization, Dropout
# Define a simplified Transformer block
def transformer_block(inputs, num_heads=2, ff_dim=64):
    # Self-attention: query and value both come from the same sequence
    attention = MultiHeadAttention(num_heads=num_heads, key_dim=ff_dim)(inputs, inputs)
    # Residual connection followed by layer normalization
    attention = LayerNormalization()(inputs + attention)
    # Feed-forward layer (a full block adds a second residual connection and normalization here)
    outputs = Dense(ff_dim, activation='relu')(attention)
    return outputs
# Example Transformer Model
inputs = Input(shape=(None, 64)) # Example input: variable-length sequences of 64-dimensional vectors
x = transformer_block(inputs)
x = Dropout(0.1)(x)
outputs = Dense(1)(x) # Output layer (one value per time step)
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer='adam', loss='mse')
BERT: Bidirectional Encoder Representations from Transformers
BERT is a transformer-based model pre-trained on large corpora and can be fine-tuned for specific tasks. Unlike traditional models, BERT considers both the left and right context of a word.
Key Concepts:
- BERT uses a bidirectional approach to context.
- It can be fine-tuned for tasks like sentiment analysis, question answering, and text classification.
Sample Code (TensorFlow for BERT):
import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification
from tensorflow.keras.optimizers import Adam
# Load pre-trained BERT model with a 2-class classification head
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased', num_labels=2)
# Load tokenizer and encode text
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
inputs = tokenizer("I love TensorFlow!", return_tensors="tf")
labels = tf.constant([1]) # Example: label '1' as positive sentiment
# Fine-tune the BERT model on your task
# The model outputs raw logits, so the loss must be told to expect them
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
              metrics=['accuracy'])
model.fit(dict(inputs), labels, epochs=1)
Day 19-20: Sentiment Analysis & Text Classification
Sentiment Analysis
Sentiment analysis involves classifying text as expressing positive, negative, or neutral sentiment. We'll use BERT for this task since it provides powerful pre-trained contextual representations.
Sample Code (PyTorch for Sentiment Analysis using BERT):
import torch
from transformers import BertTokenizer, BertForSequenceClassification
from torch.utils.data import DataLoader, TensorDataset
# Load BERT tokenizer and model
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')
# Tokenize the input text
inputs = tokenizer("I love this product!", return_tensors="pt")
labels = torch.tensor([1]) # Positive sentiment label
# Create DataLoader (the attention mask matters once batches contain padded sequences)
dataset = TensorDataset(inputs['input_ids'], inputs['attention_mask'], labels)
dataloader = DataLoader(dataset)
# Train the model (example)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
model.train()
for batch in dataloader:
    input_ids, attention_mask, labels = batch
    optimizer.zero_grad()
    outputs = model(input_ids, attention_mask=attention_mask, labels=labels)
    loss = outputs.loss
    loss.backward()
    optimizer.step()
Text Classification
Text classification includes tasks like spam detection or categorizing articles. We'll use a simple CNN model for this, but you can extend this to more complex models like transformers.
Sample Code (TensorFlow for Text Classification using CNN):
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Embedding, Conv1D, MaxPooling1D, Flatten, Dense
# Example CNN for Text Classification
# Expects X_train as integer token IDs of shape (samples, 100)
# and y_train as one-hot labels of shape (samples, 10)
model = Sequential([
    Input(shape=(100,)),
    Embedding(input_dim=5000, output_dim=64),
    Conv1D(filters=128, kernel_size=5, activation='relu'),
    MaxPooling1D(pool_size=2),
    Flatten(),
    Dense(10, activation='softmax') # Example for multi-class classification
])
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5)
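The model above consumes fixed-length sequences of integer token IDs. As a rough pure-Python sketch of that preprocessing step (the helper names build_vocab and encode, and the whitespace tokenization, are assumptions for illustration; real pipelines usually use a library tokenizer):

```python
def build_vocab(texts):
    """Map each distinct word to an integer ID; 0 is reserved for padding, 1 for unknown words."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for text in texts:
        for word in text.lower().split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(text, vocab, max_len):
    """Convert text to IDs, truncate to max_len, and right-pad with zeros."""
    ids = [vocab.get(word, 1) for word in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))

texts = ["free prize inside", "meeting at noon"]
vocab = build_vocab(texts)
encoded = encode("free meeting tomorrow", vocab, max_len=5)
print(encoded)
```

Here "tomorrow" is not in the vocabulary, so it maps to the unknown ID, and the sequence is padded to length 5.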
Day 21-22: Hands-on Projects
By the end of this week, you should have built models using RNNs, LSTMs, Transformers, and BERT. You’ll also have hands-on experience applying these models to NLP tasks like sentiment analysis and text classification. For the next step, consider implementing an end-to-end project involving text generation, time-series forecasting, or sentiment classification on a large dataset.
This concludes Week 3 of the TensorFlow and PyTorch Learning Plan. You’ll have gained a deep understanding of sequence models and NLP, along with hands-on coding experience in these areas.
Week 4: Advanced Topics & Deployment
Day 21: Hyperparameter Tuning
Hyperparameter tuning involves optimizing the settings of a model (e.g., learning rate, number of layers, units in layers) to improve its performance. We’ll use Keras Tuner for hyperparameter optimization.
Sample Code:
import tensorflow as tf
from tensorflow import keras
from keras_tuner import HyperModel, RandomSearch
# Assumed dataset: MNIST, with each 28x28 image flattened to 784 features
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data()
x_train = x_train.reshape(-1, 784).astype('float32') / 255.0
x_test = x_test.reshape(-1, 784).astype('float32') / 255.0
# Define a hypermodel for a simple neural network
class MyHyperModel(HyperModel):
    def build(self, hp):
        model = keras.Sequential([
            keras.layers.Dense(units=hp.Int('units', min_value=32, max_value=512, step=32),
                               activation='relu', input_dim=784),
            keras.layers.Dense(10, activation='softmax')
        ])
        model.compile(optimizer=keras.optimizers.Adam(),
                      loss='sparse_categorical_crossentropy', metrics=['accuracy'])
        return model
# Set up hyperparameter tuning with Keras Tuner
tuner = RandomSearch(MyHyperModel(), objective='val_accuracy', max_trials=5, executions_per_trial=3, directory='tuner')
tuner.search(x_train, y_train, epochs=5, validation_data=(x_test, y_test))
# Get the best model
best_model = tuner.get_best_models(num_models=1)[0]
best_model.summary()
This code demonstrates using Keras Tuner for hyperparameter tuning by varying the number of units in the Dense layer. The RandomSearch method automatically tests different configurations.
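The idea behind random search can be sketched without any tuning library. In the example below, validation_score is a hypothetical stand-in for "train a model and return its validation accuracy"; only the sampling loop is the point:

```python
import random

def validation_score(learning_rate, units):
    # Hypothetical stand-in for training a model and measuring validation accuracy;
    # by construction it peaks at learning_rate=0.01 and units=128
    return 1.0 - abs(learning_rate - 0.01) * 10 - abs(units - 128) / 1000

random.seed(0)
best_score, best_config = float("-inf"), None
for trial in range(20):
    config = {
        "learning_rate": 10 ** random.uniform(-4, -1),   # sampled on a log scale
        "units": random.randrange(32, 513, 32),          # 32, 64, ..., 512
    }
    score = validation_score(**config)
    if score > best_score:
        best_score, best_config = score, config

print(best_config, round(best_score, 3))
```

Sampling the learning rate on a log scale is a common choice because useful values span several orders of magnitude.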
Day 22-23: Generative Models (GANs, VAEs)
Generative Models learn to generate new data similar to the data they were trained on. We'll focus on GANs and VAEs.
Generative Adversarial Networks (GANs)
A GAN consists of two parts: a Generator that creates fake data and a Discriminator that tries to distinguish between real and fake data.
import tensorflow as tf
from tensorflow.keras import layers
# Generator model
def build_generator():
    model = tf.keras.Sequential([
        layers.Dense(128, activation="relu", input_shape=(100,)),
        layers.Dense(256, activation="relu"),
        layers.Dense(512, activation="relu"),
        layers.Dense(1024, activation="relu"),
        layers.Dense(784, activation="sigmoid"),
        layers.Reshape((28, 28, 1))
    ])
    return model
# Discriminator model
def build_discriminator():
    model = tf.keras.Sequential([
        layers.Flatten(input_shape=(28, 28, 1)),
        layers.Dense(1024, activation="relu"),
        layers.Dense(512, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(1, activation="sigmoid")
    ])
    return model
# Create models
generator = build_generator()
discriminator = build_discriminator()
# Compile the Discriminator
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
# Combine generator and discriminator into a GAN model
discriminator.trainable = False
gan_input = layers.Input(shape=(100,))
x = generator(gan_input)
gan_output = discriminator(x)
gan = tf.keras.models.Model(gan_input, gan_output)
# Compile GAN model
gan.compile(optimizer='adam', loss='binary_crossentropy')
This simple GAN model generates 28x28 images similar to those from the MNIST dataset. We use Adam optimizer and binary cross-entropy loss function.
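Training a GAN alternates between a discriminator update and a generator update. The sketch below shows one such round in PyTorch rather than Keras, with deliberately tiny networks and random placeholder data; all sizes and learning rates are assumptions for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim, data_dim, batch_size = 16, 32, 8

generator = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                          nn.Linear(64, data_dim), nn.Sigmoid())
discriminator = nn.Sequential(nn.Linear(data_dim, 64), nn.ReLU(),
                              nn.Linear(64, 1))   # outputs a raw logit

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)

real = torch.rand(batch_size, data_dim)   # placeholder for a batch of real data
ones, zeros = torch.ones(batch_size, 1), torch.zeros(batch_size, 1)

# 1) Discriminator step: push real samples toward 1 and generated samples toward 0
fake = generator(torch.randn(batch_size, latent_dim)).detach()   # no gradient into the generator
d_loss = bce(discriminator(real), ones) + bce(discriminator(fake), zeros)
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# 2) Generator step: try to make the discriminator output 1 for generated samples
fake = generator(torch.randn(batch_size, latent_dim))
g_loss = bce(discriminator(fake), ones)
opt_g.zero_grad()
g_loss.backward()
opt_g.step()

print(d_loss.item(), g_loss.item())
```

In practice these two steps repeat over many batches, and keeping the two networks balanced is the main difficulty.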
Variational Autoencoders (VAEs)
VAEs are a type of autoencoder that learns to generate new data points from a latent space. Here’s an implementation using TensorFlow:
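import tensorflow as tf
from tensorflow.keras import layers, models
class Sampling(layers.Layer):
    # Reparameterization trick: z = mean + sigma * epsilon
    # The layer also registers the KL divergence as an extra loss term
    def call(self, inputs):
        z_mean, z_log_var = inputs
        kl_loss = -0.5 * tf.reduce_sum(
            1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)
        self.add_loss(tf.reduce_mean(kl_loss))
        epsilon = tf.random.normal(shape=tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon
def build_encoder(latent_dim):
    inputs = layers.Input(shape=(784,))
    x = layers.Dense(512, activation='relu')(inputs)
    x = layers.Dense(256, activation='relu')(x)
    z_mean = layers.Dense(latent_dim)(x)
    z_log_var = layers.Dense(latent_dim)(x)
    encoder = models.Model(inputs, [z_mean, z_log_var])
    return encoder
def build_decoder(latent_dim):
    latent_inputs = layers.Input(shape=(latent_dim,))
    x = layers.Dense(256, activation='relu')(latent_inputs)
    x = layers.Dense(512, activation='relu')(x)
    x = layers.Dense(784, activation='sigmoid')(x)
    decoder = models.Model(latent_inputs, x)
    return decoder
latent_dim = 2
encoder = build_encoder(latent_dim)
decoder = build_decoder(latent_dim)
vae_input = layers.Input(shape=(784,))
z_mean, z_log_var = encoder(vae_input)
z = Sampling()([z_mean, z_log_var])  # sample a latent vector instead of decoding the mean directly
vae_output = decoder(z)
vae = models.Model(vae_input, vae_output)
# VAE loss: reconstruction term (summed over the 784 pixels) plus the KL term added by Sampling
def reconstruction_loss(y_true, y_pred):
    return 784 * tf.keras.losses.binary_crossentropy(y_true, y_pred)
vae.compile(optimizer='adam', loss=reconstruction_loss)
# Train with the input as its own target, e.g. vae.fit(x, x, epochs=..., batch_size=...)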
This VAE architecture encodes inputs into a lower-dimensional latent space and then decodes them back to the original space.
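Two pieces of a VAE are easy to check by hand: sampling a latent vector with the reparameterization trick, and the closed-form KL divergence to a standard normal. A NumPy sketch follows, where the function names and the example values are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_latent(z_mean, z_log_var):
    # Reparameterization trick: z = mean + sigma * epsilon, with epsilon ~ N(0, I)
    epsilon = rng.standard_normal(z_mean.shape)
    return z_mean + np.exp(0.5 * z_log_var) * epsilon

def kl_divergence(z_mean, z_log_var):
    # KL(N(mean, sigma^2) || N(0, I)), summed over the latent dimensions
    return -0.5 * np.sum(1 + z_log_var - z_mean**2 - np.exp(z_log_var), axis=-1)

z_mean = np.array([[0.0, 0.0], [1.0, -1.0]])
z_log_var = np.zeros((2, 2))      # log-variance 0 means unit variance

z = sample_latent(z_mean, z_log_var)
kl = kl_divergence(z_mean, z_log_var)
print(z.shape, kl)
```

The first row already matches the standard normal prior, so its KL term is 0; the second row has a shifted mean and is penalized.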
Day 24: Deploying a Model with Flask or FastAPI
To deploy a model, you can use Flask or FastAPI. Here’s an example of deploying a model using Flask:
Flask Deployment Example:
from flask import Flask, request, jsonify
import numpy as np
import tensorflow as tf
# Load the trained model
model = tf.keras.models.load_model('my_model.h5')
app = Flask(__name__)
@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json()
    # JSON carries the image as nested lists; convert it to an array shaped (batch, ...)
    image = np.array(data['image'], dtype='float32')
    prediction = model.predict(image) # Ensure the same preprocessing steps as in training
    return jsonify({'prediction': prediction.tolist()})
if __name__ == '__main__':
    app.run(debug=True) # debug mode is for local development only
This API will receive a POST request with the image data, process it, and return the model’s prediction.
Day 25: Model Optimization (Quantization, Pruning)
Quantization reduces the precision of the weights, which helps in reducing the model size and improving inference speed. Here’s how to apply quantization using TensorFlow Lite:
import tensorflow as tf
# Convert model to TensorFlow Lite format with optimizations
converter = tf.lite.TFLiteConverter.from_saved_model('my_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
# Save the quantized model
with open('model_quantized.tflite', 'wb') as f:
    f.write(tflite_model)
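This code reduces the model size by storing weights at lower precision, typically while maintaining most of the accuracy. Re-check accuracy on a validation set after conversion to confirm.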
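What quantization does to the numbers can be sketched in NumPy. This is a simplified affine int8 scheme written for illustration, not the exact procedure TensorFlow Lite uses; the function names are assumptions:

```python
import numpy as np

def quantize(weights):
    """Affine quantization of a float array to int8."""
    w_min, w_max = float(weights.min()), float(weights.max())
    scale = (w_max - w_min) / 255.0                  # float step per integer level
    zero_point = int(round(-128 - w_min / scale))    # integer offset so w_min maps near -128
    q = np.clip(np.round(weights / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    return (q.astype(np.float32) - zero_point) * scale

rng = np.random.default_rng(0)
weights = rng.normal(size=1000).astype(np.float32)

q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = float(np.abs(weights - restored).max())
print(q.dtype, weights.nbytes, q.nbytes, max_error)
```

Storage drops by a factor of four (float32 to int8), and the reconstruction error stays on the order of one quantization step.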
Day 26-27: Reinforcement Learning
Reinforcement learning involves training an agent to make decisions by receiving feedback (rewards). Here’s a simple skeleton using Gymnasium (the maintained successor to OpenAI Gym) and TensorFlow.
import gymnasium as gym
import numpy as np
import tensorflow as tf
env = gym.make('CartPole-v1')
# Build a simple neural network for Q-function approximation
model = tf.keras.Sequential([
    tf.keras.layers.Dense(24, activation='relu', input_shape=(env.observation_space.shape[0],)),
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(env.action_space.n)
])
optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
# Sample RL training loop (simplified)
for episode in range(1000):
    state, _ = env.reset()  # reset() returns (observation, info)
    state = np.reshape(state, [1, -1])
    done = False
    total_reward = 0
    while not done:
        action = int(np.argmax(model(state))) # Choose the action with the highest Q-value
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        total_reward += reward
        # Update the Q-values here (not shown for brevity)
        state = np.reshape(next_state, [1, -1])  # move on to the next state
    print(f'Episode {episode}, Total Reward: {total_reward}')
This code sets up a CartPole-v1 environment and a neural network to approximate the Q-values. The agent chooses actions based on the highest Q-value.
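The loop above leaves the Q-value update out. In its simplest tabular form, the Q-learning update can be sketched as follows; the table size, transitions, and hyperparameters are made up for illustration, and a neural-network version would instead regress the network's output toward the same target:

```python
import numpy as np

n_states, n_actions = 3, 2
Q = np.zeros((n_states, n_actions))   # tabular Q-values, initialized to zero
alpha, gamma = 0.5, 0.9               # learning rate and discount factor

def q_update(Q, state, action, reward, next_state, done):
    # Bellman target: immediate reward plus the discounted best future value
    target = reward if done else reward + gamma * Q[next_state].max()
    Q[state, action] += alpha * (target - Q[state, action])

# Two hypothetical transitions: (state, action, reward, next_state, done)
q_update(Q, 1, 0, 1.0, 2, True)    # terminal step: Q[1, 0] becomes 0.5
q_update(Q, 0, 1, 0.0, 1, False)   # bootstraps from Q[1]: Q[0, 1] becomes 0.225
print(Q)
```

The second update shows how reward information propagates backward: state 0 gains value only because it leads to state 1.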
Day 28-30: End-to-End Project (Choose a topic & implement)
For the final project, pick a complex application involving what you’ve learned:
Project Idea: Build an Image Classification and Deployment Pipeline
- Choose a dataset: Use a dataset like CIFAR-10 or MNIST.
- Data Preprocessing: Normalize images and split them into training and validation sets.
- Model Design: Use a CNN or a pre-trained model like ResNet or EfficientNet.
- Hyperparameter Tuning: Use Keras Tuner to fine-tune the model.
- Deployment: Create a Flask or FastAPI app to serve the trained model.
- Optimization: Use TensorFlow Lite to optimize the model for faster inference.
Hands-on Activity: Deploy the end-to-end pipeline with all steps included: data preprocessing, model training, optimization, and deployment.
This plan provides a structured approach to mastering TensorFlow & PyTorch over 30 days! 🚀