Digital Bootcamp - Intensive Class

Fullstack Data Science

"Master the Science of Data – Power Up Your Career in AI & Analytics"

📚 Why learn Fullstack Data Science?

👨‍🏫 Your trainers 👩‍🏫

Yoshua C P

System, AI/ML & Data Expert

System Architecture, Machine/Deep Learning, Big Data & BI, Cloud Computing.

Dimas Rizky L

AI/ML & Data Specialist

Software & Data Engineering, Big Data, Data Science, AI/ML Development, Microservices.

Refanda S

AI/ML & Software Engineer

Software & Data Engineering, Big Data, Data Science, AI/ML Development, Microservices.

🎯 Goals and objectives of this bootcamp

  • Master data science skills holistically by learning technical and analytical skills, from data processing to AI model implementation, to become a job-ready data scientist.
  • Improve career readiness in the data field by developing the programming, data analysis, machine learning, and data engineering skills that companies across all industry sectors need.
  • Master the latest data science tools and techniques in line with industry standards and best practices.
  • Build a portfolio of real projects that demonstrates expertise in data analysis, storytelling, and predictive modeling in a professional setting.

💻 Topics you will learn

  • Overview of Data Science & Its Role in Industry
  • Key Differences: Data Scientist vs. Data Analyst vs. Data Engineer
  • Data Science Workflow: From Data Collection to Deployment
  • Ethical Considerations in Data Science
  • Fundamentals of Artificial Intelligence & Machine Learning
  • Supervised vs. Unsupervised Learning
  • Bias & Fairness in AI Models
  • AI Model Deployment & MLOps Overview
  • Descriptive & Inferential Statistics
  • Probability Distributions & Hypothesis Testing
  • Correlation vs. Causation
  • Data Normalization & Outlier Detection
  • Python Basics for ML (NumPy, Pandas, Scikit-learn)
  • Feature Engineering & Data Preprocessing
  • Training & Evaluating ML Models
  • Model Optimization Techniques
  • Data Extraction from APIs, Databases, and Web Scraping
  • ETL (Extract, Transform, Load) Process
  • Data Cleaning & Preprocessing
  • Big Data Technologies Overview (Hadoop, Spark)
  • Data Modeling Techniques & Feature Selection
  • Using Looker, Flourish, and Power BI/Tableau
  • Effective Data Storytelling
  • Exploratory Data Analysis (EDA)
  • Data Wrangling Techniques
  • Time-Series Analysis
  • SQL for Data Science (Joins, CTE, Subqueries)
  • Query Optimization & Indexing
  • Combining SQL with Python for Analysis
  • Regression & Classification Techniques
  • Time-Series Forecasting
  • Evaluating Model Accuracy
  • Hands-on: Building a predictive model in Python
  • Automating Data Workflows
  • Building Scalable Data Pipelines with Apache Airflow
  • Cloud Data Engineering (AWS/GCP/Azure)
  • Hands-on: Implementing a real-time data pipeline
  • Tokenization, Lemmatization, and Sentiment Analysis
  • Named Entity Recognition (NER)
  • Word Embeddings & Transformer Models
  • Hands-on: Sentiment analysis on social media data
  • Understanding Retrieval-Augmented Generation (RAG)
  • Implementing LLMs for Data Science (ChatGPT, GPT-4, BERT)
  • Hands-on: Using OpenAI API for automated data insights
  • Projects
  • Assessment / Competency Test

🎁 Benefits you get

  • Guided by IT experts and top-level industry management
  • Flexible program with a focus on specific skill sets
  • Certificate issued by CCIT FT-UI (Universitas Indonesia)
  • 3+ months of learning and upskilling with top industry practitioners
  • Personal mentoring and 24-hour access to materials via the LMS
  • Exclusive bonus: 2 soft-skill modules to get you job-ready!

⚙️ Tools you will use

  • Jupyter Notebook
  • GitLab
  • TensorFlow
  • Docker
  • Power BI
  • Python
  • NumPy
  • Pandas
  • Scrapy
  • Apache Spark
  • Apache Airflow
  • Looker
  • XGBoost
  • AWS Glue
  • Google Dataflow
  • NLTK
  • spaCy
  • Hugging Face Transformers
  • OpenAI API
  • Kubernetes

📝 Projects you will build

  • AI-based Media Monitoring application
  • AI-based Fraud Detection application
  • AI-based Text Recognition application
  • Executive Dashboard application

🏢 Career prospects

📢 Who is this class for?

  • Students (UI and non-UI) and members of the general public who want to learn and enrich their portfolio in Data, Artificial Intelligence (AI), and Machine Learning (ML)
  • Fresh graduates who want to strengthen their CV with practical skills the industry is looking for
  • Programmers who want to upskill in data science
  • Freelancers who are reskilling or upskilling
  • Startup enthusiasts aiming to become a Data Scientist, ML Engineer, or AI Specialist
  • IT managers or managerial teams who want deep insight into Data, AI, and ML

🕣 Schedule

  1. Live Zoom sessions every Tuesday and Thursday, 18.30 – 21.30 (evening)
  2. Each session runs 2.5 – 3 hours, over 3 months.
  3. Classes start on 12 August 2025.

🗂️ How the program runs

  1. Registered participants must join the WhatsApp group provided.
  2. Live sessions are held online via Zoom for 24 meetings, followed by a portfolio project and soft-skill coaching for job readiness and career development.
  3. Participants must be active on the LMS (Learning Management System), both in the lessons and in the collaboration forum.
  4. Participants can download or access learning materials, including sample source code (programming classes only), on the LMS.
  5. Participants can ask questions and discuss materials, assignments, and consultations with mentors and other participants on the LMS.

Premium

Rp.7.000.000

General

Rp.5.500.000

Student

Rp.3.500.000

Module 1: Introduction to Data Science

Topic: Overview of Data Science & Its Role in Industry

  • What is Data Science? Key Concepts & Applications
  • Evolution of Data Science & Industry Adoption
  • Emerging Trends: AI, ML, and Big Data in Business
  • Hands-on: Exploring real-world case studies of data science-driven decision-making

Topic: Key Differences: Data Scientist vs. Data Analyst vs. Data Engineer

  • Role Comparison: Responsibilities, Skills, and Tools Used
  • Career Paths & Specializations in Data Science
  • When to Hire a Data Scientist vs. Data Engineer vs. Data Analyst
  • How These Roles Work Together in a Data Team
  • Hands-on: Identifying real-world job descriptions & role alignment

Module 2: AI/ML Concept

Topic: Fundamentals of Artificial Intelligence & Machine Learning

  • What is AI? Evolution & Key Concepts
  • Machine Learning vs. Deep Learning vs. Traditional AI
  • Real-World Applications of AI/ML in Various Industries
  • Key ML Components: Data, Features, Algorithms, and Models
  • Hands-on: Exploring AI use cases & identifying business applications

Topic: Supervised vs. Unsupervised Learning

  • Supervised Learning:
    – Regression & Classification Techniques
    – Common Algorithms (Linear Regression, Decision Trees, SVM, etc.)
    – Model Evaluation & Performance Metrics (RMSE, Accuracy, Precision, Recall)
  • Unsupervised Learning:
    – Clustering & Dimensionality Reduction Techniques
    – K-Means, Hierarchical Clustering, PCA
    – Applications in Anomaly Detection & Customer Segmentation
  • Hands-on:
    – Implementing a Supervised ML Model using Python (Scikit-learn)
    – Performing Clustering Analysis on real-world data

Topic: Bias & Fairness in AI Models

  • Understanding Bias in Data & AI Models
  • Types of Bias: Selection Bias, Confirmation Bias, Algorithmic Bias
  • Fairness Metrics in AI: Demographic Parity, Equalized Odds
  • Techniques to Mitigate Bias in Machine Learning Models
  • Hands-on: Analyzing bias in an ML model and applying mitigation techniques

Topic: AI Model Deployment & MLOps Overview

  • Introduction to MLOps & AI Model Lifecycle
  • Model Deployment Strategies (On-Premise, Cloud, Edge AI)
  • CI/CD for Machine Learning Models
  • Monitoring AI Models in Production & Performance Optimization
  • Hands-on: Deploying an ML Model on a Cloud Platform (AWS, Azure, or Google Cloud)

Module 4: Machine Learning Programming with Python

Topic: Python Basics for ML (NumPy, Pandas, Scikit-learn)

  • Overview of Python for Data Science & Machine Learning
  • Introduction to NumPy: Arrays, Vectorized Operations, Indexing
  • Data Manipulation with Pandas: DataFrames, Cleaning, Aggregation
  • Machine Learning Basics with Scikit-learn
  • Exploratory Data Analysis (EDA) Techniques
  • Hands-on:
    – Working with real-world datasets using NumPy & Pandas
    – Implementing basic data exploration and visualization

Topic: Feature Engineering & Data Preprocessing

  • Importance of Feature Engineering in ML
  • Handling Missing Data: Imputation Strategies
  • Feature Scaling & Normalization (Min-Max, Standardization)
  • Encoding Categorical Variables (One-Hot, Label Encoding)
  • Feature Selection Techniques (PCA, Mutual Information)
  • Hands-on: Preprocessing a dataset for ML using Scikit-learn pipelines

Topic: Training & Evaluating ML Models

  • Understanding Train-Test Splitting & Cross-Validation
  • Common ML Algorithms:
    – Linear Regression, Logistic Regression
    – Decision Trees, Random Forest, Gradient Boosting
    – k-Nearest Neighbors, Support Vector Machines (SVM)
  • Evaluating Model Performance:
    – Metrics for Regression (RMSE, R²)
    – Metrics for Classification (Accuracy, Precision, Recall, F1-score, AUC-ROC)
  • Hands-on: Training & Evaluating Multiple ML Models on a Business Dataset

Module 5: Data Engineering: from collecting to transformation

Topic: Data Cleaning & Preprocessing

  • Handling Missing & Duplicate Data
    – Imputation Techniques (Mean, Median, Mode)
    – Dropping vs. Filling Missing Values
  • Data Standardization & Normalization
    – Min-Max Scaling, Z-score Standardization
    – Handling Categorical Data (One-Hot Encoding, Label Encoding)
  • Detecting & Handling Outliers
    – Z-score, IQR Method, Winsorization
  • Ensuring Data Integrity
    – Data Consistency Checks & Schema Validation
  • Hands-on:
    – Cleaning a messy dataset and preparing it for machine learning

Topic: Big Data Technologies Overview (Hadoop, Spark)

  • Introduction to Big Data & Distributed Computing
    – What is Big Data? 3Vs (Volume, Velocity, Variety)
    – Traditional Databases vs. Distributed Systems
  • Hadoop Ecosystem Overview
    – HDFS, MapReduce, Apache Hive, Apache Pig
  • Apache Spark for Data Processing
    – Spark vs. Hadoop: Key Differences
    – Spark DataFrames & RDDs
    – Parallel Computing & Performance Optimization
  • Cloud-based Big Data Solutions
    – AWS EMR, Google BigQuery, Databricks
  • Hands-on:
    – Running data transformations using PySpark

Module 6: Data Modelling & Visualization

Topic: Using Looker, Flourish, and Power BI/Tableau

  • Introduction to Data Visualization
    – Importance of Visual Analytics in Decision-Making
    – Choosing the Right Chart Type for Different Data
  • Hands-on with Looker & Flourish
    – Looker: Building Dashboards & Real-time Analytics
    – Flourish: Creating Interactive & Storytelling-based Visuals
  • Power BI & Tableau for Enterprise Reporting
    – Connecting to Data Sources & Data Cleaning
    – Designing Interactive Dashboards & Reports
    – Advanced Features (DAX for Power BI, Calculated Fields for Tableau)
  • Hands-on:
    – Creating visual reports & interactive dashboards in Power BI/Tableau
    – Designing an engaging data story with Flourish

Topic: Effective Data Storytelling

  • Principles of Data Storytelling
    – Structuring a Data Narrative (Hook, Data, Insights, Action)
    – Balancing Data & Context for Decision-Makers
    – Using Visual Cues & Cognitive Load Reduction Techniques
  • Techniques for Communicating Insights
    – Simplifying Complex Data for Non-Technical Audiences
    – Storyboarding & Slide Design for Presentations
    – Creating Engaging Reports with Infographics & Dynamic Charts
  • Hands-on:
    – Presenting a data-driven business case using storytelling techniques

Module 7: Data Analysis with Python

Topic: Data Wrangling Techniques

  • Understanding Data Wrangling
    – Difference between Data Cleaning, Transformation, and Feature Engineering
    – Common Challenges in Raw Data
  • Handling Missing & Inconsistent Data
    – Imputation Techniques (Mean, Median, Mode, Forward Fill, KNN)
    – Dealing with Duplicates & Anomalies
  • Transforming & Reshaping Data
    – Pivot Tables, Melting, Grouping, and Aggregation
    – Encoding Categorical Variables (One-Hot, Label Encoding)
    – Feature Scaling & Normalization (Min-Max Scaling, Standardization)
  • Automating Data Wrangling Workflows
    – Using Python Libraries: Pandas, NumPy, Dask
    – Writing Efficient Data Pipelines
  • Hands-on: Cleaning and transforming messy datasets using Python (Pandas, NumPy, Scikit-learn)

Module 8: Applied Data Science with SQL

Topic: SQL for Data Science (Joins, CTE, Subqueries)

  • Introduction to SQL for Data Science
    – Role of SQL in Data Science Workflows
    – SQL vs. NoSQL: When to Use Each
  • Data Querying with SQL
    – Basic Queries: SELECT, WHERE, GROUP BY, ORDER BY
    – Filtering & Aggregating Data
  • Advanced SQL Queries
    – Using Joins (INNER, LEFT, RIGHT, FULL OUTER) for Data Merging
    – Using Common Table Expressions (CTEs) for Readability & Optimization
    – Using Subqueries for Complex Data Analysis
    – Window Functions (RANK, DENSE_RANK, LAG, LEAD)
  • Hands-on:
    – Writing advanced SQL queries on real-world datasets using PostgreSQL/MySQL
    – Using CTEs and subqueries for multi-step analysis
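
As a rough illustration of the CTE, join, and subquery items above, the sketch below runs a multi-step query from Python. It assumes SQLite (via the standard-library `sqlite3` module) as a stand-in for the PostgreSQL/MySQL databases named in the syllabus, and the `customers` and `orders` tables and their rows are made up for the example.

```python
import sqlite3

# In-memory SQLite database; the tables and rows are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ani'), (2, 'Budi'), (3, 'Citra');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, 50.0), (3, 2, 30.0);
""")

query = """
WITH totals AS (                       -- CTE: total spend per customer
    SELECT customer_id, SUM(amount) AS total
    FROM orders
    GROUP BY customer_id
)
SELECT c.name,
       COALESCE(t.total, 0) AS total_spent,
       -- Subquery: compare each total against the average order amount
       COALESCE(t.total, 0) > (SELECT AVG(amount) FROM orders) AS above_avg
FROM customers AS c
LEFT JOIN totals AS t ON t.customer_id = c.id   -- keep customers with no orders
ORDER BY total_spent DESC;
"""

rows = conn.execute(query).fetchall()
for name, total_spent, above_avg in rows:
    print(name, total_spent, bool(above_avg))
```

The LEFT JOIN keeps customers who have no orders, which an INNER JOIN would silently drop.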

Topic: Query Optimization & Indexing

  • Understanding SQL Query Performance
    – How SQL Queries are Executed (Query Execution Plan)
    – Common Performance Bottlenecks
  • Optimizing SQL Queries
    – Indexing Strategies (Clustered vs. Non-Clustered Indexes)
    – Query Rewriting Techniques (Avoiding SELECT *)
    – Using Partitioning for Large Datasets
  • Database Optimization Techniques
    – Caching Strategies & Materialized Views
    – Using EXPLAIN ANALYZE for Query Debugging
  • Hands-on:
    – Optimizing slow queries using EXPLAIN ANALYZE
    – Implementing indexes & partitioning in PostgreSQL/MySQL

Module 9: Predictive Analysis

Topic: Time-Series Forecasting

  • Introduction to Time-Series Forecasting
    – Time-Series Data vs. Traditional Datasets
    – Components of Time-Series (Trend, Seasonality, Noise)
  • Exploratory Data Analysis for Time-Series
    – Visualizing Trends & Seasonality (Line Plots, Decomposition)
    – Stationarity Testing (Augmented Dickey-Fuller Test)
  • Time-Series Forecasting Techniques
    – Moving Averages & Exponential Smoothing
    – ARIMA & SARIMA Models
    – Machine Learning for Forecasting (XGBoost, LSTMs)
  • Hands-on:
    – Forecasting sales, stock prices, or demand using ARIMA & LSTMs
    – Evaluating time-series models with Mean Absolute Error (MAE) & RMSE
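
To make the exponential smoothing and error-metric items above concrete, here is a minimal sketch in plain Python. It assumes simple (single) exponential smoothing with a hand-picked `alpha`, and the `sales` series is invented for illustration; ARIMA or LSTM models would need dedicated libraries.

```python
import math

def exponential_smoothing(series, alpha):
    """One-step-ahead forecasts: forecasts[t] is the prediction for series[t]."""
    forecasts = [series[0]]  # initialise with the first observation
    for t in range(1, len(series)):
        # Blend the latest observation with the previous forecast
        forecasts.append(alpha * series[t - 1] + (1 - alpha) * forecasts[t - 1])
    return forecasts

def mae(actual, predicted):
    """Mean Absolute Error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def rmse(actual, predicted):
    """Root Mean Squared Error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Hypothetical weekly sales figures
sales = [10, 12, 14, 13, 15]
forecasts = exponential_smoothing(sales, alpha=0.5)

# Skip the first point, whose forecast is just the initial value
print("MAE :", mae(sales[1:], forecasts[1:]))
print("RMSE:", rmse(sales[1:], forecasts[1:]))
```

RMSE penalises large misses more heavily than MAE, so comparing the two gives a hint about whether errors are spread evenly or dominated by a few bad forecasts.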

Module 10: Data Pipeline

Topic: Automating Data Workflows

  • Introduction to Data Workflow Automation
    – The Importance of Automation in Data Science
    – Batch vs. Real-time Data Processing
    – Common Tools for Workflow Automation (Airflow, Prefect, Luigi)
  • Scheduling & Orchestrating Automated Tasks
    – Creating Task Pipelines for Data Extraction & Transformation
    – Implementing Monitoring & Logging for Data Workflows
  • Hands-on: Automating ETL (Extract, Transform, Load) Pipelines
    – Building a simple automated ETL pipeline
    – Automating Data Cleaning & Transformation with Python

Topic: Building Scalable Data Pipelines with Apache Airflow

  • Introduction to Apache Airflow
    – Why Airflow? Benefits for Data Pipeline Management
    – DAGs (Directed Acyclic Graphs) & Task Dependencies
  • Designing Scalable Data Pipelines
    – Creating & Scheduling DAGs for Data Processing
    – Managing Task Failures & Retries
    – Parallel Processing & Performance Optimization
  • Hands-on: Deploying Airflow Pipelines
    – Writing Python-based DAGs for Workflow Orchestration
    – Integrating Airflow with SQL, APIs, and Cloud Storage
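
The idea behind DAGs and task dependencies can be sketched without Airflow itself. The example below is not Airflow code: it uses Python's standard-library `graphlib` (Python 3.9+) to compute a valid execution order, and the task names are hypothetical.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical pipeline: each task maps to the set of tasks it depends on.
dependencies = {
    "extract_api": set(),
    "extract_db": set(),
    "transform": {"extract_api", "extract_db"},
    "load": {"transform"},
    "report": {"load"},
}

def run_order(deps):
    """Return one valid execution order in which every task follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())

order = run_order(dependencies)
print(order)
```

An orchestrator such as Airflow adds scheduling, retries, and parallel execution on top of this ordering; a cycle in the graph makes the ordering impossible, which is why the graph has to be acyclic.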

Topic: Cloud Data Engineering (AWS/GCP/Azure)

  • Overview of Cloud Data Engineering
    – Cloud Data Services: AWS, GCP, Azure
    – Serverless vs. Managed Data Services
  • Building Cloud-based Data Pipelines
    – AWS: Using AWS Glue, Lambda, S3, Redshift
    – GCP: Using BigQuery, Cloud Functions, Dataflow
    – Azure: Using Azure Data Factory, Synapse Analytics
  • Security & Cost Optimization in Cloud Data Engineering
    – Role-based Access Control (RBAC)
    – Cost-efficient Storage & Compute Strategies
  • Hands-on: Deploying a Cloud-based Data Pipeline
    – Setting up an end-to-end pipeline using AWS/GCP/Azure
    – Automating data ingestion & transformation in the cloud

Topic: Hands-on: Implementing a real-time data pipeline

Module 12: RAG & LLM

Topic: Understanding Retrieval-Augmented Generation (RAG)

  • Introduction to Retrieval-Augmented Generation (RAG)
    – How RAG Combines Retrieval & Generation for AI-powered Applications
    – Benefits of RAG over Traditional LLMs
    – Real-world Use Cases (Search Engines, Knowledge Assistants, Q&A Systems)
  • RAG Pipeline Architecture
    – Understanding the Role of Vector Databases (FAISS, Pinecone, Weaviate)
    – How LLMs Retrieve and Generate Contextually Accurate Responses
  • Hands-on: Implementing a Basic RAG System
    – Setting up a Simple RAG Model with OpenAI API & FAISS
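
The retrieval half of a RAG pipeline can be sketched with the standard library alone. In the example below a bag-of-words count vector stands in for a real embedding model, a plain list stands in for a vector database such as FAISS, and the final LLM call is left out; the documents and the `build_prompt` helper are hypothetical.

```python
import math
import re
from collections import Counter

def embed(text):
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def retrieve(question, documents, k=1):
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question, documents, k=1):
    """Assemble the augmented prompt that would be sent to an LLM."""
    context = "\n".join(retrieve(question, documents, k))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

docs = [
    "Airflow schedules tasks as directed acyclic graphs.",
    "PCA reduces the number of features in a dataset.",
    "ARIMA is a classical model for time-series forecasting.",
]
question = "Which model is used for time-series forecasting?"
prompt = build_prompt(question, docs)
print(prompt)
```

In a real system the generation step would pass `prompt` to a language model, so the answer is grounded in the retrieved text rather than in the model's memory alone.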

Topic: Implementing LLMs for Data Science (ChatGPT, GPT-4, BERT)

  • Overview of Large Language Models (LLMs)
    – Understanding ChatGPT, GPT-4, BERT, and T5
    – Differences Between Fine-tuned vs. Zero-shot Models
  • LLM Applications in Data Science
    – Automating Data Cleaning, Feature Engineering, and Insights Generation
    – Using LLMs for Code Generation & Debugging
  • Fine-tuning & Prompt Engineering for LLMs
    – How to Customize LLMs for Domain-specific Tasks
    – Optimizing Prompting Techniques for Improved Response Accuracy
  • Hands-on: Deploying an LLM-powered Data Science Assistant
    – Using OpenAI API for Text Summarization, Classification, and Question Answering
    – Integrating LLMs with Pandas, SQL, and Visualization Tools

Topic: Hands-on: Using OpenAI API for automated data insights

Module 1: Introduction to Data Science

Topic: Data Science Workflow: From Data Collection to Deployment

  • End-to-End Data Science Process:
    – Data Collection & Sourcing Strategies
    – Data Cleaning & Preprocessing
    – Exploratory Data Analysis (EDA)
    – Feature Engineering & Model Building
    – Model Evaluation & Optimization
    – Deployment & Monitoring (MLOps)
  • Tools & Technologies Used in the Workflow (Python, SQL, Cloud Platforms, etc.)
  • Hands-on: Mapping the Data Science workflow for a given business scenario

Topic: Ethical Considerations in Data Science

  • Importance of Ethical AI & Responsible Data Usage
  • Data Privacy Regulations (GDPR, CCPA, and Local Laws)
  • Bias & Fairness in AI/ML Models
  • Transparency & Explainability in Data-Driven Decision-Making
  • Hands-on: Case study on biased AI models and how to mitigate bias

Module 3: Basic Statistics for Data Science

Topic: Descriptive & Inferential Statistics

  • Descriptive Statistics:
    – Measures of Central Tendency (Mean, Median, Mode)
    – Measures of Dispersion (Variance, Standard Deviation, Range, IQR)
    – Data Visualization (Histograms, Box Plots, Scatter Plots)
  • Inferential Statistics:
    – Sampling Methods & Central Limit Theorem
    – Confidence Intervals & Margin of Error
    – Parametric vs. Non-Parametric Tests
  • Hands-on: Calculating descriptive statistics & visualizing distributions using Python (Pandas, Matplotlib, Seaborn)
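
The measures of central tendency and dispersion listed above can be computed with Python's standard-library `statistics` module, as in the sketch below. The `orders` sample is made up, and the class itself uses Pandas, Matplotlib, and Seaborn rather than this module.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    # Quartiles use the module's default ('exclusive') method
    q1, _, q3 = statistics.quantiles(values, n=4)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "variance": statistics.variance(values),  # sample variance (n - 1)
        "stdev": statistics.stdev(values),        # sample standard deviation
        "range": max(values) - min(values),
        "iqr": q3 - q1,
    }

# Hypothetical sample: daily order counts for one week
orders = [12, 15, 15, 18, 20, 22, 45]
summary = describe(orders)
for name, value in summary.items():
    print(f"{name:>8}: {value}")
```

The single large value pulls the mean above the median, the kind of skew that a histogram or box plot makes visible at a glance.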

Topic: Probability Distributions & Hypothesis Testing

  • Probability Distributions:
    – Normal, Binomial, Poisson, and Exponential Distributions
    – Skewness & Kurtosis in Data Distribution
    – Probability Density & Cumulative Distribution Functions
  • Hypothesis Testing:
    – Null vs. Alternative Hypothesis
    – One-Tailed vs. Two-Tailed Tests
    – p-values, Significance Levels, and Type I/II Errors
    – t-Tests, ANOVA, Chi-Square Test
  • Hands-on: Conducting hypothesis testing on real-world datasets using Python (SciPy, Statsmodels)

Topic: Correlation vs. Causation

  • Correlation Analysis:
    – Pearson, Spearman, and Kendall Correlation Coefficients
    – Interpreting Correlation Matrices
    – Pitfalls of Spurious Correlations
  • Causation & Causal Inference:
    – Observational vs. Experimental Studies
    – A/B Testing & Randomized Control Trials
    – Simpson’s Paradox & Confounding Variables
  • Hands-on: Performing correlation analysis on business data & visualizing relationships

Topic: Data Normalization & Outlier Detection

  • Data Normalization:
    – Standardization (Z-score) vs. Min-Max Scaling
    – When to Normalize Data & Impact on Machine Learning Models
  • Outlier Detection:
    – Box Plot & Z-Score Method
    – Mahalanobis Distance & Isolation Forest
    – Handling Outliers: Removal vs. Transformation
  • Hands-on: Detecting & handling outliers in a dataset using Python (Scikit-learn)
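
The scaling and outlier rules listed above come down to a few lines of arithmetic. The sketch below uses only the standard library, and the `orders` sample and the conventional `k=1.5` box-plot multiplier are assumptions for illustration; the hands-on session uses Scikit-learn instead.

```python
import statistics

def min_max_scale(values):
    """Rescale values to the [0, 1] range (assumes max != min)."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def z_scores(values):
    """Standardize values to mean 0 and standard deviation 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    return [(v - mean) / sd for v in values]

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR], the usual box-plot rule."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Hypothetical sample with one suspiciously large value
orders = [12, 15, 15, 18, 20, 22, 45]
scaled = min_max_scale(orders)
standardized = z_scores(orders)
flagged = iqr_outliers(orders)
print("scaled :", [round(s, 2) for s in scaled])
print("flagged:", flagged)
```

Min-max scaling is sensitive to the outlier itself (it compresses the other values toward 0), which is one reason to detect and handle outliers before normalizing.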

Module 4: Machine Learning Programming with Python

Topic: Model Optimization Techniques

  • Hyperparameter Tuning: Grid Search vs. Random Search
  • Regularization Techniques (L1, L2, Elastic Net)
  • Ensemble Learning: Bagging vs. Boosting (XGBoost, LightGBM)
  • Bias-Variance Tradeoff & Overfitting Prevention
  • Hands-on: Implementing hyperparameter tuning using Scikit-learn & Optuna

Module 5: Data Engineering: from collecting to transformation

Topic: Data Extraction from APIs, Databases, and Web Scraping

  • Data Extraction from APIs
    – Understanding RESTful APIs
    – Authentication & API Tokens (OAuth, API Keys)
    – Extracting Data using Python (Requests, JSON, Pandas)
  • Database Extraction (SQL & NoSQL)
    – Connecting to Databases (PostgreSQL, MySQL, MongoDB)
    – Querying & Extracting Large Datasets
  • Web Scraping Fundamentals
    – HTML & XPath Basics
    – Scraping with BeautifulSoup & Selenium
    – Ethical Considerations & Legal Aspects
  • Hands-on:
    – Extracting real-time data from a public API (e.g., Twitter, OpenWeather)
    – Scraping e-commerce data using BeautifulSoup

Topic: ETL (Extract, Transform, Load) Process

  • Understanding the ETL Pipeline
    – Extracting Data from Multiple Sources
    – Data Transformation Techniques (Aggregation, Normalization)
    – Data Loading Strategies (Batch vs. Streaming)
  • ETL Tools & Frameworks
    – Apache Airflow for Workflow Automation
    – Data Pipelines with Python (Pandas, PySpark)
    – Cloud-based ETL (AWS Glue, Google Dataflow)
  • Hands-on:
    – Building an ETL pipeline to extract, clean, and load data into a database

Module 6: Data Modelling & Visualization

Topic: Data Modeling Techniques & Feature Selection

  • Understanding Data Modeling
    – Conceptual, Logical, and Physical Data Models
    – Data Relationships & Entity-Relationship Diagrams (ERD)
  • Feature Engineering & Selection
    – Importance of Feature Engineering in ML Models
    – Feature Transformation Techniques (Polynomial Features, Log Transform)
    – Dimensionality Reduction (PCA, t-SNE, Autoencoders)
    – Feature Importance Techniques (SHAP, LIME, Mutual Information)
  • Hands-on:
    – Applying feature selection techniques on a real-world dataset
    – Implementing PCA for dimensionality reduction

Module 7: Data Analysis with Python

Topic: Exploratory Data Analysis (EDA)

  • Introduction to EDA
    – Importance of EDA in Data Science
    – Understanding the Data Lifecycle
  • Summary Statistics & Data Distribution
    – Measures of Central Tendency & Dispersion (Mean, Median, Variance)
    – Data Distribution & Normality Tests
    – Identifying Outliers using Boxplots, Z-score, and IQR
  • Data Visualization for EDA
    – Univariate Analysis (Histograms, KDE, Boxplots)
    – Bivariate & Multivariate Analysis (Scatter Plots, Heatmaps, Pair Plots)
    – Correlation Analysis & Feature Relationships

Topic: Time-Series Analysis

  • Fundamentals of Time-Series Data
    – Time-Series vs. Traditional Datasets
    – Components of Time-Series Data (Trend, Seasonality, Residual)
  • Time-Series Data Preprocessing
    – Handling Missing Values in Time-Series
    – Resampling & Aggregating Time-Series Data
    – Rolling Statistics & Moving Averages
  • Exploratory Analysis for Time-Series
    – Visualizing Time-Series Trends (Line Plots, Decomposition)
    – Autocorrelation & Partial Autocorrelation Functions (ACF & PACF)
  • Introduction to Time-Series Forecasting
    – Classical Models: ARIMA, SARIMA, Exponential Smoothing
    – Machine Learning for Time-Series (XGBoost, LSTMs)
  • Hands-on:
    – Analyzing stock market, weather, or sales data using Statsmodels & Prophet
    – Building an ARIMA-based forecasting model

Module 8: Applied Data Science with SQL

Topic: Combining SQL with Python for Analysis

  • Integrating SQL with Python
    – Using Pandas & SQLAlchemy for Database Queries
    – Connecting to Databases via PostgreSQL, MySQL, SQLite
  • Automating Data Analysis Workflows
    – Writing & Executing Queries with Python
    – Storing & Processing Large Datasets using Pandas & SQL
  • Building a Data Pipeline with SQL & Python
    – Extracting Data with SQL Queries
    – Transforming Data with Pandas
    – Exporting Processed Data for Machine Learning
  • Hands-on:
    – Writing Python scripts to fetch & process SQL data using SQLAlchemy & Pandas
    – Automating ETL workflows with SQL & Python

Module 9: Predictive Analysis

Topic: Regression & Classification Techniques

  • Introduction to Regression & Classification
    – Understanding Supervised Learning
    – Key Differences: Regression vs. Classification
    – Common Use Cases in Business & Industry
  • Regression Techniques
    – Simple & Multiple Linear Regression
    – Polynomial Regression
    – Regularization Methods (Ridge, Lasso, ElasticNet)
  • Classification Techniques
    – Logistic Regression
    – Decision Trees & Random Forests
    – Support Vector Machines (SVM)
    – Neural Networks for Classification (Intro to Deep Learning)
  • Hands-on:
    – Building Regression & Classification Models using Scikit-learn
    – Evaluating Model Performance with RMSE, R², Precision-Recall, F1-score

Topic: Evaluating Model Accuracy

  • Understanding Model Performance Metrics
    – Regression Metrics: MSE, RMSE, R² Score
    – Classification Metrics: Accuracy, Precision, Recall, F1-score, ROC-AUC
    – Forecasting Metrics: MAE, RMSE, MAPE
  • Bias-Variance Tradeoff
    – Overfitting vs. Underfitting
    – Techniques to Improve Generalization
  • Hyperparameter Tuning & Model Optimization
    – Grid Search vs. Random Search
    – Cross-Validation Strategies
  • Hands-on:
    – Comparing multiple models using Hyperparameter Tuning & Cross-Validation
    – Visualizing confusion matrices & ROC curves
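
The classification metrics above all derive from the four cells of a confusion matrix. The sketch below computes them by hand for a binary problem; the label lists are invented, and in practice Scikit-learn's metric functions would usually be used instead.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels: 1 = positive class, 0 = negative class
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(confusion_counts(y_true, y_pred))
print(metrics)
```

On imbalanced data accuracy can look high while recall is poor, which is why precision, recall, and F1 are reported alongside it.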

Topic: Hands-on: Building a predictive model in Python

Module 11: Text Analytics (NLP)

Topic: Tokenization, Lemmatization, and Sentiment Analysis

  • Introduction to NLP & Text Processing
    – Understanding Unstructured Data
    – NLP Pipeline & Preprocessing Steps
  • Tokenization & Text Normalization
    – Word vs. Subword Tokenization
    – Sentence Segmentation & Stopword Removal
    – Hands-on with NLTK & spaCy
  • Lemmatization & Stemming
    – Differences & Use Cases
    – Implementing Lemmatization with Python
  • Sentiment Analysis Techniques
    – Rule-based vs. Machine Learning Approaches
    – Implementing Sentiment Analysis with VADER & TextBlob
    – Hands-on: Building a Simple Sentiment Classifier
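
A rule-based approach to sentiment can be sketched in a few lines: tokenize, drop stopwords, and sum word scores from a lexicon. The tiny `LEXICON` and `STOPWORDS` sets below are made up for the example; tools such as VADER and TextBlob ship much larger lexicons and handle negation and intensity, which this sketch does not.

```python
import re

# Tiny hypothetical lexicon and stopword list, for illustration only.
LEXICON = {"good": 1, "great": 2, "love": 2, "bad": -1, "terrible": -2, "slow": -1}
STOPWORDS = {"the", "a", "is", "and", "but", "was", "it"}

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())

def sentiment(text):
    """Return (score, label) by summing lexicon scores of non-stopword tokens."""
    tokens = [t for t in tokenize(text) if t not in STOPWORDS]
    score = sum(LEXICON.get(t, 0) for t in tokens)
    label = "positive" if score > 0 else "negative" if score < 0 else "neutral"
    return score, label

print(sentiment("The service was great but the app is slow"))
print(sentiment("Terrible support"))
```

A machine learning approach would instead learn word weights from labelled examples, trading the transparency of a fixed lexicon for better coverage of domain-specific language.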

Topic: Named Entity Recognition (NER)

  • Understanding Named Entity Recognition (NER)
    – Types of Named Entities (People, Organizations, Locations, etc.)
    – Applications in Industry (Chatbots, Search Engines, Finance)
  • Building an NER Model with spaCy & Hugging Face
    – Rule-based vs. Machine Learning-based NER
    – Pre-trained Models vs. Custom Training
  • Hands-on: Implementing NER
    – Extracting entities from real-world datasets
    – Fine-tuning an NER model using transformer-based approaches

Topic: Word Embeddings & Transformer Models

  • Understanding Word Embeddings
    – Word2Vec, GloVe, FastText
    – Contextual vs. Non-contextual Embeddings
  • Transformer-based Language Models
    – Introduction to Transformer Architecture
    – Understanding BERT, GPT, and T5 Models
  • Fine-tuning Transformer Models for NLP Tasks
    – Sentiment Analysis, NER, Text Classification
    – Hands-on: Fine-tuning BERT for Custom NLP Tasks

Topic: Hands-on: Sentiment analysis on social media data

Module 13: Project Portfolio

Topic: Projects

Topic: Assessment / Competency Test

Frequently Asked Questions

No. This course is designed and tailored for beginners, students, the general public, and professionals without an IT background. The material is structured in stages, from the basics to an advanced level, so anyone can follow it.

Yes. After completing all the material and assignments, you will receive an official certificate issued by CCIT FT Universitas Indonesia (UI), which you can use when applying for jobs or add to your professional portfolio.

This course uses a blended learning method, combining:

  • Self-paced learning on the e-learning platform, where participants can access materials, videos, and assignments at any time.
  • Virtual meetings via Zoom (live sessions) with mentors, scheduled regularly for discussion, Q&A, or interactive coverage of key topics.
    This method offers flexible learning together with an interactive, mentor-supported experience.

Yes. We provide a discussion forum, Q&A sessions with mentors, and technical support to help you throughout your learning.

For courses in general (other than Mobile Development), the minimum recommended hardware is:

  • Processor: at least dual-core, such as a 6th-generation Intel Core i3 or AMD Ryzen 3 2200U
  • RAM: at least 4GB (8GB recommended)
  • Operating system: Windows 10, macOS 10.13, or newer
  • Internet connection: stable, at least 10 Mbps

For Mobile Development and Game Development courses, the recommended hardware is:

  • Processor: quad-core, such as an 8th-generation Intel Core i5 or AMD Ryzen 5 3500U
  • RAM: at least 8GB (12GB or more recommended)
  • Storage: SSD, at least 256GB

Yes. This course is run in partnership with CCIT FT Universitas Indonesia, so the certificate issued carries high credibility and can add value to your CV.

The intensive bootcamp runs for 3 months, with live Zoom sessions twice a week, each lasting 3 hours. Sessions take place on weekdays at 19.00 – 22.00 WIB or on weekends at 09.00 – 12.00.

The fast track class runs for 5 days, with live Zoom sessions 5 times a week, each lasting 3 hours. Sessions take place on weekdays at 19.00 – 22.00 WIB or on weekends at 09.00 – 12.00.

Yes. Course materials can be accessed at any time through the LMS or LXP platform, so you can study flexibly outside the live session schedule.

Yes. Assignments are given at the end of every meeting. In addition, participants work on a real project as part of the learning process and their portfolio.

Yes. This is a paid course, but you get lifetime access to all learning materials, including videos, modules, and the discussion forum.

Yes. Participants receive direct guidance from professional mentors, plus access to a dedicated group chat for discussion and consultation.

There is no entrance test for programs at Digiskill Hub; anyone, from any background, can join.

Want to learn other exciting digital skills?

We also offer other Intensive Bootcamp programs: in-depth, hands-on learning alongside mentors!