Tuesday, 18 March 2025

30-Day TensorFlow & PyTorch Learning Plan

PyTorch and TensorFlow are the two most widely used deep learning frameworks, essential for building and deploying AI models.

PyTorch is known for its dynamic computation graph, making it highly flexible for research and experimentation. It provides an intuitive, Pythonic interface, making it popular in academia and among AI researchers. PyTorch is ideal for tasks like computer vision (via TorchVision), NLP (via Hugging Face transformers), and reinforcement learning.

TensorFlow, on the other hand, is geared toward large-scale deployments and production environments. TensorFlow 2.x runs eagerly by default, and wrapping code in tf.function compiles it into a static computation graph, which can improve performance and scalability. With TensorFlow Extended (TFX) and TensorFlow Lite, it supports end-to-end ML workflows, including mobile and edge deployment.

Both frameworks are critical for advancing AI applications, from research to real-world implementation, enabling innovations in healthcare, finance, and autonomous systems.

Week 1: Foundations (Tensors, Autograd, and Basic Models)

Day 1: Understanding Tensors

Concept: Learn what tensors are and how they work.
Tip: Practice tensor operations like addition, multiplication, and reshaping.
Example Code:

TensorFlow:

import tensorflow as tf
tensor = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
print(tf.reduce_sum(tensor))

PyTorch:

import torch
tensor = torch.tensor([[1, 2], [3, 4]], dtype=torch.float32)
print(torch.sum(tensor))

Day 2: Automatic Differentiation (Autograd & GradientTape)

Concept: Compute gradients automatically.
Tip: Use GradientTape (TF) and Autograd (PyTorch) for differentiation.
Example: Compute the derivative of y = x^2.

TensorFlow:

x = tf.Variable(3.0)
with tf.GradientTape() as tape:
    y = x**2
grad = tape.gradient(y, x)
print(grad)  # 6.0

PyTorch:

x = torch.tensor(3.0, requires_grad=True)
y = x**2
y.backward()
print(x.grad)  # 6.0
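
A finite-difference estimate is a handy way to sanity-check the gradient that either framework reports. The sketch below uses plain Python only; the helper name numeric_grad and the step size h are illustrative choices, not part of TensorFlow or PyTorch.

```python
def numeric_grad(f, x, h=1e-5):
    """Central-difference estimate of df/dx at x."""
    return (f(x + h) - f(x - h)) / (2 * h)


def square(x):
    return x ** 2


estimate = numeric_grad(square, 3.0)
print(round(estimate, 4))  # 6.0
```

If the estimate and the framework's gradient disagree by more than a small tolerance, the usual suspects are a missing requires_grad=True or an operation performed outside the GradientTape context.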

Day 3: Building a Simple Neural Network

Concept: Create a feedforward neural network.
Tip: Use Sequential API (TF Keras) and nn.Module (PyTorch).

TensorFlow:

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

model = Sequential([
    Dense(10, activation='relu', input_shape=(4,)),
    Dense(1, activation='sigmoid')
])
model.summary()

PyTorch:

import torch.nn as nn

class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc1 = nn.Linear(4, 10)
        self.fc2 = nn.Linear(10, 1)

    def forward(self, x):
        return torch.sigmoid(self.fc2(torch.relu(self.fc1(x))))

model = SimpleNN()
print(model)

Day 4: Activation Functions (ReLU, Sigmoid, Softmax)

Concept: Understand activation functions.
Tip: ReLU is preferred for hidden layers, Softmax for classification.

TensorFlow:

import tensorflow as tf
print(tf.nn.relu([-1.0, 2.0]))  # values: [0.0, 2.0]

PyTorch:

import torch.nn.functional as F
print(F.relu(torch.tensor([-1.0, 2.0])))  # [0.0, 2.0]
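
The tip above also mentions Sigmoid and Softmax, which the framework snippets do not show. As a rough, framework-free sketch of what the three functions compute (the function names here are my own, and softmax subtracts the maximum logit first, a common numerical-stability trick):

```python
import math


def relu(x):
    """Pass positive values through, clamp negatives to zero."""
    return max(0.0, x)


def sigmoid(x):
    """Squash any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def softmax(logits):
    """Turn a list of scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the max avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


print([relu(v) for v in (-1.0, 2.0)])  # [0.0, 2.0]
print(sigmoid(0.0))                    # 0.5
probs = softmax([1.0, 2.0, 3.0])
print(abs(sum(probs) - 1.0) < 1e-9)    # True
```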

Day 5: Loss Functions & Optimizers

Concept: Learn loss functions like MSE, CrossEntropy and optimizers like Adam, SGD.
Tip: Use Adam optimizer for adaptive learning.

TensorFlow:

from tensorflow.keras.optimizers import Adam
optimizer = Adam(learning_rate=0.001)

PyTorch:

import torch.optim as optim
optimizer = optim.Adam(model.parameters(), lr=0.001)
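
The snippets above set up the optimizer only, so here is a small plain-Python sketch of the two loss functions named in the concept line. It is meant to show the arithmetic, not to replace the built-in losses; the function names and the eps clamp are assumptions of this sketch.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error over paired values."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_entropy(true_class, probs, eps=1e-12):
    """Cross-entropy for one sample, given predicted class probabilities."""
    return -math.log(max(probs[true_class], eps))


print(mse([1.0, 2.0], [1.0, 4.0]))                # 2.0
print(round(cross_entropy(1, [0.5, 0.5]), 4))     # 0.6931
```

A confident correct prediction drives the cross-entropy toward zero, while a confident wrong one is penalized heavily, which is why it is the usual choice for classification.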

Day 6: Training a Model (Forward & Backward Pass)

Concept: Implement a full training step.
Tip: Use model.fit() (TF) and a custom loop (PyTorch).

TensorFlow:

# X_train and y_train are assumed to be NumPy arrays of shape (n, 4) and (n,)
model.compile(optimizer='adam', loss='binary_crossentropy')
model.fit(X_train, y_train, epochs=10, batch_size=32)

PyTorch:

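# criterion is the loss function; BCELoss matches the sigmoid output of SimpleNN
# and expects float labels with the same shape as the outputs.
# train_loader is assumed to be a DataLoader yielding (inputs, labels) batches.
criterion = nn.BCELoss()
for images, labels in train_loader: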
    optimizer.zero_grad()
    outputs = model(images)
    loss = criterion(outputs, labels)
    loss.backward()
    optimizer.step()
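
To make the forward pass, backward pass, and optimizer step less of a black box, here is a minimal hand-written version for a one-parameter model y = w * x. It is only a sketch under simple assumptions (full-batch gradient descent, MSE loss, a hypothetical train_linear helper); the frameworks do the same three things with automatic differentiation.

```python
def train_linear(xs, ys, lr=0.05, epochs=20):
    """Fit y = w * x with full-batch gradient descent on the MSE loss."""
    w = 0.0
    for _ in range(epochs):
        # Forward pass: compute predictions with the current weight
        preds = [w * x for x in xs]
        # Backward pass: d(MSE)/dw, recomputed from scratch each iteration
        grad = sum(2 * (p - y) * x for p, y, x in zip(preds, ys, xs)) / len(xs)
        # Optimizer step: move against the gradient
        w -= lr * grad
    return w


w = train_linear([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
print(round(w, 4))  # 2.0
```

Recomputing grad from scratch on every iteration plays the role of optimizer.zero_grad() in the PyTorch loop, where gradients would otherwise accumulate across steps.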

Day 7: Evaluating & Saving Models

Concept: Save and reload trained models.

TensorFlow:

model.save("model.h5")
model = tf.keras.models.load_model("model.h5")

PyTorch:

torch.save(model.state_dict(), "model.pth")
model.load_state_dict(torch.load("model.pth"))

Week 2: Deep Learning with CNNs and RNNs

Day 8-10: Convolutional Neural Networks (CNNs)

  • Understanding convolutional layers, pooling, and feature extraction.
  • Implementing CNNs from scratch in TensorFlow and PyTorch.
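
Before reaching for Conv2D or nn.Conv2d, it can help to see what a convolutional layer and a pooling layer compute. The following is a rough NumPy sketch (NumPy stands in for either framework here; conv2d and max_pool2d are names chosen for this example) using no padding and a stride of 1.

```python
import numpy as np


def conv2d(image, kernel):
    """Valid, stride-1 cross-correlation, which is what DL 'convolution' layers compute."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Multiply the window by the kernel element-wise and sum
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out


def max_pool2d(feature_map, size=2):
    """Non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = feature_map.shape[0] // size
    w = feature_map.shape[1] // size
    trimmed = feature_map[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))


image = np.arange(25, dtype=float).reshape(5, 5)
edge_kernel = np.array([[1.0, -1.0]])  # horizontal difference filter
features = conv2d(image, edge_kernel)
print(features.shape)              # (5, 4)
print(max_pool2d(features).shape)  # (2, 2)
```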

Day 11-13: Transfer Learning with Pretrained Models

  • Using models like ResNet, VGG, and MobileNet.
  • Fine-tuning and feature extraction for custom datasets.

Day 14: Data Augmentation Techniques

  • Applying transformations like rotation, flipping, and cropping.
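  • Using ImageDataGenerator in TensorFlow (deprecated in recent releases in favor of Keras preprocessing layers such as RandomFlip and RandomRotation) and torchvision.transforms in PyTorch.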

Example Code:

TensorFlow:

from tensorflow.keras.preprocessing.image import ImageDataGenerator

datagen = ImageDataGenerator(rotation_range=40, horizontal_flip=True)

PyTorch:

import torchvision.transforms as transforms

transform = transforms.Compose([
    transforms.RandomRotation(40),
    transforms.RandomHorizontalFlip()
])

Week 3: NLP & Sequence Models

In Week 3, we’ll explore key concepts and models used in Natural Language Processing (NLP) and sequence modeling, including RNNs, LSTMs, Transformers, and BERT. You'll also gain hands-on experience with practical applications like sentiment analysis and text classification.

Day 15-16: RNNs & LSTMs

RNNs: Recurrent Neural Networks

RNNs are designed to handle sequential data, making them perfect for tasks like language modeling, speech recognition, and time-series forecasting. However, they suffer from the vanishing gradient problem when learning long-term dependencies.

Key Concepts:

  • RNNs maintain hidden states that are passed across time steps.
  • RNNs can struggle with long sequences because gradients can vanish or explode.
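
The vanishing-gradient point can be made concrete with a one-unit RNN. The sketch below is plain Python with made-up weights (w, u) and inputs; it tracks how strongly the final hidden state depends on the initial one by multiplying the per-step derivatives, as backpropagation through time does.

```python
import math


def rnn_gradient_through_time(w, u, inputs, h0=0.0):
    """Run a scalar RNN h_t = tanh(w*h_{t-1} + u*x_t).

    Returns the final hidden state and d h_T / d h_0.
    """
    h = h0
    grad = 1.0
    for x in inputs:
        h = math.tanh(w * h + u * x)
        grad *= w * (1.0 - h * h)  # chain rule through one time step
    return h, grad


inputs = [0.1] * 50
_, small = rnn_gradient_through_time(w=0.5, u=1.0, inputs=inputs)
_, large = rnn_gradient_through_time(w=0.5, u=1.0, inputs=inputs[:5])
print(abs(small) < abs(large))  # True
```

Each step multiplies the gradient by a factor below 0.5 in this setup, so after 50 steps almost nothing of the signal from the first step remains.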

Sample Code (TensorFlow for RNN):

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import SimpleRNN, Dense
from tensorflow.keras.optimizers import Adam

# Prepare sequential data (example: simple sequence prediction).
# Keras RNN layers expect input of shape (samples, timesteps, features).
X_train = np.array([1, 2, 3, 4], dtype=np.float32).reshape(-1, 1, 1)
y_train = np.array([2, 3, 4, 5], dtype=np.float32).reshape(-1, 1)

# Define the RNN model
model = Sequential([
    SimpleRNN(10, input_shape=(1, 1), activation='relu'),
    Dense(1)
])

model.compile(optimizer=Adam(), loss='mse')
model.fit(X_train, y_train, epochs=100)

# Make a prediction for the value that follows 5
prediction = model.predict(np.array([[[5.0]]], dtype=np.float32))
print(prediction)

LSTMs: Long Short-Term Memory

LSTM is a special type of RNN designed to mitigate the vanishing gradient problem by using memory cells. These cells have gates that control the flow of information, allowing LSTMs to capture long-term dependencies.

Key Concepts:

  • LSTM cells include input, forget, and output gates.
  • They are capable of learning long-range dependencies in sequential data.
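
To show how the three gates interact, here is a single-unit LSTM step written in plain Python. It follows the standard LSTM equations, but the parameter layout (a dict of (w_x, w_h, bias) tuples) and the weights are hypothetical, chosen so the forget gate stays open and the input gate stays shut.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x, h_prev, c_prev, params):
    """One step of a single-unit LSTM cell.

    params maps each of 'i', 'f', 'o', 'g' to a (w_x, w_h, bias) tuple.
    """
    def pre(name):
        w_x, w_h, b = params[name]
        return w_x * x + w_h * h_prev + b

    i = sigmoid(pre('i'))    # input gate: how much new information to write
    f = sigmoid(pre('f'))    # forget gate: how much old cell state to keep
    o = sigmoid(pre('o'))    # output gate: how much of the cell to expose
    g = math.tanh(pre('g'))  # candidate cell value
    c = f * c_prev + i * g
    h = o * math.tanh(c)
    return h, c


# Hypothetical weights: forget gate held open, input gate held shut
params = {'i': (0.0, 0.0, -20.0), 'f': (0.0, 0.0, 20.0),
          'o': (0.0, 0.0, 0.0), 'g': (1.0, 0.0, 0.0)}
h, c = 0.0, 0.8
for x in [1.0, -1.0, 0.5] * 10:
    h, c = lstm_step(x, h, c, params)
print(round(c, 4))  # 0.8
```

After 30 steps the cell state is still 0.8, which is the mechanism that lets an LSTM carry information across long sequences when it learns to keep the forget gate open.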

Sample Code (TensorFlow for LSTM):

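import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.optimizers import Adam

# Toy sequence shaped (samples, timesteps, features), as Keras RNN layers expect
X_train = np.array([1, 2, 3, 4], dtype=np.float32).reshape(-1, 1, 1)
y_train = np.array([2, 3, 4, 5], dtype=np.float32).reshape(-1, 1)

# LSTM model for a time-series forecasting task
model = Sequential([
    LSTM(50, input_shape=(1, 1), activation='relu'),
    Dense(1)
])

model.compile(optimizer=Adam(), loss='mse')
model.fit(X_train, y_train, epochs=100)

# Make a prediction for the value that follows 5
prediction = model.predict(np.array([[[5.0]]], dtype=np.float32))
print(prediction)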


Day 17-18: Transformers & BERT

Transformers

The Transformer model, introduced in "Attention is All You Need," revolutionized NLP by using self-attention mechanisms. Unlike RNNs, Transformers don't process data sequentially, allowing for parallelization and improved efficiency in capturing long-range dependencies.

Key Concepts:

  • Self-attention allows each word in a sequence to attend to every other word.
  • Multi-head attention enables the model to capture different relationships simultaneously.
  • Positional encoding is used to provide information about the order of words.
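
The first and third concepts above can be sketched in a few lines of NumPy. This is a single-head illustration with random, untrained projection matrices (w_q, w_k, w_v are placeholders), plus the sinusoidal positional encoding from the original paper; it is meant to show the shapes and the arithmetic rather than a trained model.

```python
import numpy as np


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # stabilize before exponentiating
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention; x has shape (seq_len, d_model)."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(k.shape[-1])
    weights = softmax(scores)  # row i: how much token i attends to each token
    return weights @ v, weights


def positional_encoding(seq_len, d_model):
    """Sinusoidal encoding: sine on even feature indices, cosine on odd ones."""
    pos = np.arange(seq_len)[:, None]
    idx = np.arange(d_model)[None, :]
    angles = pos / np.power(10000.0, (2 * (idx // 2)) / d_model)
    return np.where(idx % 2 == 0, np.sin(angles), np.cos(angles))


rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
x = rng.normal(size=(seq_len, d_model)) + positional_encoding(seq_len, d_model)
w_q, w_k, w_v = (rng.normal(size=(d_model, d_model)) for _ in range(3))
out, weights = self_attention(x, w_q, w_k, w_v)
print(out.shape, weights.shape)  # (4, 8) (4, 4)
```

Multi-head attention runs several such heads with smaller projections in parallel and concatenates their outputs.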

Sample Code (TensorFlow for Transformer):

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense, MultiHeadAttention, LayerNormalization, Dropout, Add

# Define a Transformer encoder block: self-attention and a feed-forward
# network, each followed by a residual connection and layer normalization
def transformer_block(inputs, num_heads=2, key_dim=32, ff_dim=64):
    attention = MultiHeadAttention(num_heads=num_heads, key_dim=key_dim)(inputs, inputs)
    x = LayerNormalization()(Add()([inputs, attention]))
    ff = Dense(ff_dim, activation='relu')(x)
    ff = Dense(inputs.shape[-1])(ff)  # project back to the model dimension
    return LayerNormalization()(Add()([x, ff]))

# Example Transformer model
inputs = Input(shape=(None, 64))  # (sequence length, embedding size)
x = transformer_block(inputs)
x = Dropout(0.1)(x)
outputs = Dense(1)(x)  # One output per sequence position
model = tf.keras.Model(inputs, outputs)

model.compile(optimizer='adam', loss='mse')

BERT: Bidirectional Encoder Representations from Transformers

BERT is a Transformer-based encoder that is pre-trained on large text corpora and can be fine-tuned for specific tasks. Unlike left-to-right language models, BERT conditions on both the left and right context of each word.

Key Concepts:

  • BERT uses a bidirectional approach to context.
  • It can be fine-tuned for tasks like sentiment analysis, question answering, and text classification.

Sample Code (TensorFlow for BERT):

import tensorflow as tf
from transformers import BertTokenizer, TFBertForSequenceClassification
# Requires a transformers release that ships the TensorFlow model classes

# Load pre-trained BERT model (the classification head defaults to 2 labels)
model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased')

# Load tokenizer and encode text
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
inputs = tokenizer("I love TensorFlow!", return_tensors="tf")
labels = tf.constant([1])  # Example: label '1' as positive sentiment

# Fine-tune the BERT model on your task.
# The model outputs raw logits, so the loss must use from_logits=True.
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)
model.fit(dict(inputs), labels, epochs=1)


Day 19-20: Sentiment Analysis & Text Classification

Sentiment Analysis

Sentiment analysis involves classifying text as expressing positive, negative, or neutral sentiment. We'll use BERT for this task because its pre-trained contextual representations transfer well to sentence-level classification.

Sample Code (PyTorch for Sentiment Analysis using BERT):

import torch
from transformers import BertTokenizer, BertForSequenceClassification
from torch.utils.data import DataLoader, TensorDataset

# Load BERT tokenizer and model (the classification head defaults to 2 labels)
tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
model = BertForSequenceClassification.from_pretrained('bert-base-uncased')

# Tokenize the input text
inputs = tokenizer("I love this product!", return_tensors="pt")
labels = torch.tensor([1])  # Positive sentiment label

# Create DataLoader (the attention mask separates real tokens from padding)
dataset = TensorDataset(inputs['input_ids'], inputs['attention_mask'], labels)
dataloader = DataLoader(dataset)

# Train the model (example: one pass over a single sentence)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-5)
model.train()
for input_ids, attention_mask, batch_labels in dataloader:
    optimizer.zero_grad()
    outputs = model(input_ids, attention_mask=attention_mask, labels=batch_labels)
    loss = outputs.loss
    loss.backward()
    optimizer.step()

Text Classification

Text classification includes tasks like spam detection or categorizing articles. We'll use a simple CNN model for this, but you can extend this to more complex models like transformers.

Sample Code (TensorFlow for Text Classification using CNN):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, Embedding, Conv1D, MaxPooling1D, Flatten, Dense

# Example CNN for text classification.
# X_train: integer token IDs of shape (n, 100) with values below 5000 (assumed);
# y_train: one-hot labels of shape (n, 10) (assumed).
model = Sequential([
    Input(shape=(100,)),
    Embedding(input_dim=5000, output_dim=64),
    Conv1D(filters=128, kernel_size=5, activation='relu'),
    MaxPooling1D(pool_size=2),
    Flatten(),
    Dense(10, activation='softmax')  # Example for multi-class classification
])

model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X_train, y_train, epochs=5)


Day 21-22: Hands-on Projects

By the end of this week, you should have built models using RNNs, LSTMs, Transformers, and BERT. You’ll also have hands-on experience applying these models to NLP tasks like sentiment analysis and text classification. For the next step, consider implementing an end-to-end project involving text generation, time-series forecasting, or sentiment classification on a large dataset.


This concludes Week 3 of the TensorFlow and PyTorch Learning Plan. You’ll have gained a deep understanding of sequence models and NLP, along with hands-on coding experience in these areas.



Week 4: Advanced Topics & Deployment


Day 21: Hyperparameter Tuning

Hyperparameter tuning involves optimizing the settings of a model (e.g., learning rate, number of layers, units in layers) to improve its performance. We’ll use Keras Tuner for hyperparameter optimization.

Sample Code:

import tensorflow as tf
from tensorflow import keras
from keras_tuner import HyperModel, RandomSearch

# Define a hypermodel for a simple neural network
class MyHyperModel(HyperModel):
    def build(self, hp):
        model = keras.Sequential([
            keras.layers.Dense(units=hp.Int('units', min_value=32, max_value=512, step=32), 
                               activation='relu', input_dim=784),
            keras.layers.Dense(10, activation='softmax')
        ])
        model.compile(optimizer=keras.optimizers.Adam(),
                      loss='sparse_categorical_crossentropy', metrics=['accuracy'])
        return model

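# Set up hyperparameter tuning with Keras Tuner.
# x_train, y_train, x_test, y_test are assumed to be loaded already, with
# inputs flattened to 784 features and integer class labels (e.g. MNIST).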
tuner = RandomSearch(MyHyperModel(), objective='val_accuracy', max_trials=5, executions_per_trial=3, directory='tuner')
tuner.search(x_train, y_train, epochs=5, validation_data=(x_test, y_test))

# Get the best model
best_model = tuner.get_best_models(num_models=1)[0]
best_model.summary()

This code demonstrates using Keras Tuner for hyperparameter tuning by varying the number of units in the Dense layer. The RandomSearch method automatically tests different configurations.
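
The idea behind random search is simple enough to sketch without Keras Tuner. In the plain-Python example below, random_search and toy_objective are hypothetical names; the toy objective stands in for a real train-and-validate run and is not a measured result.

```python
import random


def random_search(objective, space, max_trials=5, seed=0):
    """Sample configurations uniformly at random and keep the best-scoring one."""
    rng = random.Random(seed)
    best_config, best_score = None, float('-inf')
    for _ in range(max_trials):
        config = {name: rng.choice(values) for name, values in space.items()}
        score = objective(config)
        if score > best_score:
            best_config, best_score = config, score
    return best_config, best_score


# Hypothetical stand-in for validation accuracy: peaks at 256 units
def toy_objective(config):
    return 1.0 - abs(config['units'] - 256) / 512


space = {'units': list(range(32, 513, 32))}  # 32 to 512 in steps of 32
best_config, best_score = random_search(toy_objective, space, max_trials=5)
print(best_config, best_score)
```

In a real run, each call to the objective would train a model, which is why max_trials and executions_per_trial dominate the total tuning time.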


Day 22-23: Generative Models (GANs, VAEs)

Generative models learn to produce new data that resembles the data they were trained on. We'll focus on GANs and VAEs.

Generative Adversarial Networks (GANs)

A GAN consists of two parts: a Generator that creates fake data and a Discriminator that tries to distinguish between real and fake data.

import tensorflow as tf
from tensorflow.keras import layers

# Generator model
def build_generator():
    model = tf.keras.Sequential([
        layers.Dense(128, activation="relu", input_shape=(100,)),
        layers.Dense(256, activation="relu"),
        layers.Dense(512, activation="relu"),
        layers.Dense(1024, activation="relu"),
        layers.Dense(784, activation="sigmoid"),
        layers.Reshape((28, 28, 1))
    ])
    return model

# Discriminator model
def build_discriminator():
    model = tf.keras.Sequential([
        layers.Flatten(input_shape=(28, 28, 1)),
        layers.Dense(1024, activation="relu"),
        layers.Dense(512, activation="relu"),
        layers.Dense(256, activation="relu"),
        layers.Dense(1, activation="sigmoid")
    ])
    return model

# Create models
generator = build_generator()
discriminator = build_discriminator()

# Compile the Discriminator
discriminator.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

# Combine generator and discriminator into a GAN model
discriminator.trainable = False
gan_input = layers.Input(shape=(100,))
x = generator(gan_input)
gan_output = discriminator(x)
gan = tf.keras.models.Model(gan_input, gan_output)

# Compile GAN model
gan.compile(optimizer='adam', loss='binary_crossentropy')

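Once trained, this simple GAN generates 28x28 images similar to those in the MNIST dataset. The code above only builds and compiles the models, using the Adam optimizer and a binary cross-entropy loss; training alternates between updating the discriminator on real and generated batches and updating the generator through the combined model.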
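
The two networks optimize opposing objectives built from the same binary cross-entropy. As a rough plain-Python sketch of those objectives (d_real and d_fake are hypothetical discriminator outputs, not values from a trained model):

```python
import math


def bce(target, prob, eps=1e-7):
    """Binary cross-entropy for one prediction."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Real samples should score 1, generated samples 0."""
    return bce(1, d_real) + bce(0, d_fake)


def generator_loss(d_fake):
    """The generator is rewarded when the discriminator scores its output as real."""
    return bce(1, d_fake)


# A discriminator that is currently winning: low loss for it, high for the generator
print(round(discriminator_loss(d_real=0.9, d_fake=0.1), 4))  # 0.2107
print(round(generator_loss(d_fake=0.1), 4))                  # 2.3026
```

Training the combined gan model with all-ones labels while the discriminator is frozen corresponds to minimizing generator_loss.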

Variational Autoencoders (VAEs)

VAEs are a type of autoencoder that learns to generate new data points from a latent space. Here’s an implementation using TensorFlow:

import tensorflow as tf
from tensorflow.keras import layers, models

class Sampling(layers.Layer):
    """Draws z = mean + sigma * epsilon and registers the KL loss."""
    def call(self, inputs):
        z_mean, z_log_var = inputs
        kl = -0.5 * tf.reduce_sum(
            1 + z_log_var - tf.square(z_mean) - tf.exp(z_log_var), axis=-1)
        self.add_loss(tf.reduce_mean(kl))
        epsilon = tf.random.normal(tf.shape(z_mean))
        return z_mean + tf.exp(0.5 * z_log_var) * epsilon

def build_encoder(latent_dim):
    inputs = layers.Input(shape=(784,))
    x = layers.Dense(512, activation='relu')(inputs)
    x = layers.Dense(256, activation='relu')(x)
    z_mean = layers.Dense(latent_dim)(x)
    z_log_var = layers.Dense(latent_dim)(x)
    encoder = models.Model(inputs, [z_mean, z_log_var])
    return encoder

def build_decoder(latent_dim):
    latent_inputs = layers.Input(shape=(latent_dim,))
    x = layers.Dense(256, activation='relu')(latent_inputs)
    x = layers.Dense(512, activation='relu')(x)
    x = layers.Dense(784, activation='sigmoid')(x)
    decoder = models.Model(latent_inputs, x)
    return decoder

latent_dim = 2
encoder = build_encoder(latent_dim)
decoder = build_decoder(latent_dim)

vae_input = layers.Input(shape=(784,))
z_mean, z_log_var = encoder(vae_input)
z = Sampling()([z_mean, z_log_var])  # sample from the latent distribution
vae_output = decoder(z)
vae = models.Model(vae_input, vae_output)

# Reconstruction loss: per-pixel binary cross-entropy summed over 784 pixels.
# The KL term is added inside the Sampling layer.
def reconstruction_loss(x, x_reconstructed):
    return 784 * tf.keras.losses.binary_crossentropy(x, x_reconstructed)

vae.compile(optimizer='adam', loss=reconstruction_loss)
# Train with the inputs as their own targets, e.g. vae.fit(x, x, ...)

This VAE architecture encodes inputs into a lower-dimensional latent space and then decodes them back to the original space.
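
Two pieces make a VAE more than a plain autoencoder: sampling from the latent distribution and the KL penalty that keeps that distribution close to a standard normal. A small plain-Python sketch of both, assuming a diagonal Gaussian latent (the function names are my own):

```python
import math
import random


def sample_latent(z_mean, z_log_var, rng):
    """Reparameterization trick: z = mean + sigma * epsilon, epsilon ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(z_mean, z_log_var)]


def kl_divergence(z_mean, z_log_var):
    """KL(N(mean, sigma^2) || N(0, 1)), summed over latent dimensions."""
    return -0.5 * sum(1 + lv - m * m - math.exp(lv)
                      for m, lv in zip(z_mean, z_log_var))


rng = random.Random(0)
z = sample_latent([0.0, 1.0], [0.0, -2.0], rng)
print(len(z))                        # 2
print(kl_divergence([1.0], [0.0]))   # 0.5
```

The KL term is zero only when the encoder outputs a mean of 0 and a log-variance of 0, so it pulls every encoded input toward the same prior that new samples are later drawn from.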


Day 24: Deploying a Model with Flask or FastAPI

To deploy a model, you can use Flask or FastAPI. Here’s an example of deploying a model using Flask:

Flask Deployment Example:

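from flask import Flask, request, jsonify
import numpy as np
import tensorflow as tf

# Load the trained model
model = tf.keras.models.load_model('my_model.h5')

app = Flask(__name__)

@app.route('/predict', methods=['POST'])
def predict():
    data = request.get_json()
    # JSON carries nested lists, so convert to an array; this assumes the payload
    # holds a single image and adds the batch dimension the model expects.
    # Apply the same preprocessing here that was used during training.
    image = np.expand_dims(np.array(data['image'], dtype=np.float32), axis=0)
    prediction = model.predict(image)
    return jsonify({'prediction': prediction.tolist()})

if __name__ == '__main__':
    app.run(debug=True)  # debug mode is for local development only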

This API will receive a POST request with the image data, process it, and return the model’s prediction.


Day 25: Model Optimization (Quantization, Pruning)

Quantization reduces the precision of the weights, which helps in reducing the model size and improving inference speed. Here’s how to apply quantization using TensorFlow Lite:

import tensorflow as tf

# Convert model to TensorFlow Lite format with optimizations
converter = tf.lite.TFLiteConverter.from_saved_model('my_model')
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()

# Save the quantized model
with open('model_quantized.tflite', 'wb') as f:
    f.write(tflite_model)

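With Optimize.DEFAULT and no representative dataset, the converter applies dynamic range quantization, storing the weights as 8-bit integers. This typically shrinks the model to roughly a quarter of its original size while maintaining most of the accuracy.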
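
To see where the size reduction comes from, here is a plain-Python sketch of both techniques named in the heading. The quantizer is a simplified per-tensor affine scheme, not the exact scheme the TFLite converter applies, and magnitude_prune is one common pruning heuristic; all names and weights are illustrative.

```python
def quantize(weights, num_bits=8):
    """Affine quantization of floats to integers in [0, 2**num_bits - 1]."""
    qmax = 2 ** num_bits - 1
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / qmax or 1.0  # avoid a zero scale for constant weights
    zero_point = round(-w_min / scale)
    q = [min(qmax, max(0, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point


def dequantize(q, scale, zero_point):
    """Map the integers back to approximate float values."""
    return [(v - zero_point) * scale for v in q]


def magnitude_prune(weights, sparsity=0.5):
    """Zero out the given fraction of weights, smallest magnitudes first."""
    k = int(len(weights) * sparsity)
    smallest = sorted(range(len(weights)), key=lambda i: abs(weights[i]))[:k]
    pruned = list(weights)
    for i in smallest:
        pruned[i] = 0.0
    return pruned


weights = [-1.0, -0.25, 0.0, 0.5, 1.0]
q, scale, zero_point = quantize(weights)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(max_error <= scale)                      # True
print(magnitude_prune(weights, sparsity=0.4))  # [-1.0, 0.0, 0.0, 0.5, 1.0]
```

Each weight now needs one byte instead of four, at the cost of a rounding error bounded by the quantization step.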


Day 26-27: Reinforcement Learning

Reinforcement learning involves training an agent to make decisions by receiving feedback (rewards). Here’s a simple implementation using OpenAI Gym (now maintained as Gymnasium) and TensorFlow.

import gymnasium as gym  # Gymnasium is the maintained successor of OpenAI Gym
import numpy as np
import tensorflow as tf

env = gym.make('CartPole-v1')

# Build a simple neural network for Q-function approximation
model = tf.keras.Sequential([
    tf.keras.layers.Dense(24, activation='relu', input_shape=(env.observation_space.shape[0],)),
    tf.keras.layers.Dense(24, activation='relu'),
    tf.keras.layers.Dense(env.action_space.n)
])

optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)

# Sample RL training loop (simplified)
for episode in range(1000):
    state, _ = env.reset()
    state = np.reshape(state, [1, -1])
    done = False
    total_reward = 0

    while not done:
        action = int(np.argmax(model(state).numpy()))  # Choose the greedy action
        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated
        total_reward += reward
        # Update the Q-values here (not shown for brevity)
        state = np.reshape(next_state, [1, -1])  # Advance to the next state
    print(f'Episode {episode}, Total Reward: {total_reward}')

This code sets up a CartPole-v1 environment and a neural network to approximate the Q-values. The agent chooses actions based on the highest Q-value.
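
The Q-value update that the loop leaves out follows the standard Q-learning rule. The sketch below shows it in plain Python with scalar values; a neural-network version would instead regress the network's output for the chosen action toward the same target. The function names, epsilon, gamma, and lr values are illustrative assumptions.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """Pick a random action with probability epsilon, else the highest-valued one."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])


def td_target(reward, next_q_values, done, gamma=0.99):
    """Bellman target: r + gamma * max Q(s', a'), with no bootstrap at episode end."""
    return reward if done else reward + gamma * max(next_q_values)


def q_update(q_value, target, lr=0.1):
    """Move the current estimate a fraction lr toward the target."""
    return q_value + lr * (target - q_value)


rng = random.Random(0)
action = epsilon_greedy([0.2, 0.7], epsilon=0.0, rng=rng)
target = td_target(reward=1.0, next_q_values=[0.5, 1.0], done=False, gamma=0.5)
print(action, target)                     # 1 1.5
print(round(q_update(0.7, target), 2))    # 0.78
```

A purely greedy agent, like the one in the loop above, never tries the action it currently rates lower; a small epsilon adds the exploration needed to correct bad early estimates.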


Day 28-30: End-to-End Project (Choose a topic & implement)

For the final project, pick a complex application involving what you’ve learned:

Project Idea: Build an Image Classification and Deployment Pipeline

  1. Choose a dataset: Use a dataset like CIFAR-10 or MNIST.

  2. Data Preprocessing: Normalize images and split them into training and validation sets.

  3. Model Design: Use a CNN or a pre-trained model like ResNet or EfficientNet.

  4. Hyperparameter Tuning: Use Keras Tuner to fine-tune the model.

  5. Deployment: Create a Flask or FastAPI app to serve the trained model.

  6. Optimization: Use TensorFlow Lite to optimize the model for faster inference.

Hands-on Activity: Deploy the end-to-end pipeline with all steps included: data preprocessing, model training, optimization, and deployment.

This plan provides a structured approach to mastering TensorFlow & PyTorch over 30 days! 🚀
