Welcome to the graduate course on Cloud Computing
Cloud computing powers many large-scale applications, from search engines like Google to social networking websites like Facebook to online stores like Amazon. More recently, it has become an essential enabling technology for emerging fields such as Artificial Intelligence (AI), Machine Learning, and the Internet of Things (IoT). The exponential growth in available data, together with rising demands for security and speed, has made the cloud computing paradigm necessary for reliable, cost-effective, and scalable computation. The dynamism and flexibility of cloud computing have opened up many new ways of deploying applications on the infrastructure that cloud service providers offer, such as renting computation resources and serverless computing.
This course covers the fundamentals of cloud services management and cloud software development, including but not limited to design patterns, application programming interfaces, and underlying middleware technologies. More specifically, we will cover cloud computing service models, data center resource management, task scheduling, resource virtualization, service-level agreements (SLAs), cloud security, software-defined networks and storage, cloud storage, and programming models. We will also discuss data center design and management strategies, which enable the economic and technological benefits of cloud computing. Lastly, we will study cloud storage concepts such as data distribution, durability, consistency, and redundancy.
Lecture Info
- Instructor: Ali Anwar
- Meeting time: T,Th 09:45 AM - 11:00 AM
- Location: Elliott Hall N119
- Email: aanwar_AT_umn.edu
Prerequisite
An undergraduate-level Operating Systems course or a similar course is preferred. This course assumes familiarity with systems, including how operating systems and networks work. If you don’t have this background, please contact me (the instructor) at the beginning of the semester.
Who is this course for?
This course is primarily intended for graduate students and motivated seniors who want to learn about the latest research advances in distributed systems and cloud computing and are interested in building the cloud-based distributed systems that existing and emerging data-intensive applications use and demand.
Grading policy
Your grade will be calculated as follows:
- 10% homework assignments (potential bonus points as well)
- 10% class participation (potential bonus points as well)
- 30% coding assignments
- 50% research projects
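As a rough illustration of how these weights combine, here is a minimal sketch that computes a final percentage from per-component scores. The component names, the 0 to 100 score scale, and the example scores are assumptions made for illustration; bonus points are not modeled.

```python
# Weights taken from the grading policy above (fractions of the final grade).
WEIGHTS = {
    "homework": 0.10,
    "participation": 0.10,
    "coding": 0.30,
    "project": 0.50,
}


def final_percentage(scores):
    """Combine per-component scores (each assumed to be on a 0-100 scale)
    into a single weighted final percentage."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student, used only to show the arithmetic:
# 0.10*90 + 0.10*100 + 0.30*80 + 0.50*85
example = {"homework": 90, "participation": 100, "coding": 80, "project": 85}
print(round(final_percentage(example), 2))
```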
Grading rules
The final grade is computed according to the following rules:
- A+: ≥ 95%; A: [90%, 95%); A-: [85%, 90%)
- B+: [80%, 85%); B: [75%, 80%); B-: [70%, 75%)
- C+: [66%, 70%); C: [63%, 66%); C-: [60%, 63%)
- D+: [56%, 60%); D: [53%, 56%); D-: [50%, 53%)
- F: < 50%
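The half-open intervals above can be read as a sorted list of lower bounds, which suggests a simple lookup. The sketch below is one way to express that mapping; the function name is hypothetical, and the 95% cutoff for A+ is an assumption inferred from the upper bound of the A range.

```python
from bisect import bisect_right

# Inclusive lower bounds of each band above F, in ascending order.
# The 95 cutoff for A+ is an assumption inferred from the A range [90%, 95%).
CUTOFFS = [50, 53, 56, 60, 63, 66, 70, 75, 80, 85, 90, 95]
LETTERS = ["F", "D-", "D", "D+", "C-", "C", "C+", "B-", "B", "B+", "A-", "A", "A+"]


def letter_grade(percentage):
    """Map a final percentage to a letter grade.

    bisect_right counts how many lower bounds are <= percentage, which is
    exactly the index of the matching band in LETTERS.
    """
    return LETTERS[bisect_right(CUTOFFS, percentage)]


for pct in (49.9, 50, 74.9, 75, 85.5, 90):
    print(pct, letter_grade(pct))
```

Because each band includes its lower bound and excludes its upper bound, a score of exactly 75% falls in B, while 74.9% falls in B-.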
Late Days
Each student will be granted 5 free late days to use for the homework. After all free late days are used up, the penalty is 10% for each additional late day. Late days cannot be used for the mid-way and final research presentations.
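The late-day accounting can be sketched as follows. This is only one possible reading of the policy: it assumes the 5 free days are a single budget shared across all homework, and that each additional late day deducts 10 percentage points from that assignment's score, with a floor of zero. The function and constant names are hypothetical.

```python
FREE_LATE_DAYS = 5       # free late-day budget per student (from the policy)
PENALTY_PER_DAY = 10.0   # assumed to mean percentage points per extra late day


def apply_late_policy(raw_score, days_late, free_days_left):
    """Return (adjusted_score, free_days_left) for one homework submission.

    Free late days are consumed first; any remaining late days each cost
    PENALTY_PER_DAY points, and the score never drops below zero.
    """
    if days_late < 0 or free_days_left < 0:
        raise ValueError("days_late and free_days_left must be non-negative")
    free_used = min(days_late, free_days_left)
    penalized_days = days_late - free_used
    adjusted = max(0.0, raw_score - PENALTY_PER_DAY * penalized_days)
    return adjusted, free_days_left - free_used


# Hypothetical sequence: two homeworks, 4 days late and then 3 days late.
score1, left = apply_late_policy(90, 4, FREE_LATE_DAYS)
score2, left = apply_late_policy(80, 3, left)
print(score1, score2, left)
```

In this sequence the first submission uses 4 free days, so the second has only 1 free day left and is penalized for the remaining 2 days.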
Student Integrity
Cheating in this course will result in a grade of F for the course, and University policies will be followed.
Students with Disabilities
If you need any accommodation, you are highly encouraged to contact both your instructor and the Disability Resources Center (DRC).
Office hours
- Where: 4-205 Keller Hall or Zoom
- When: Wednesday 10 am - 12 pm
- Additional office hours can be requested via Slack
Course Schedule
The course schedule is tentative and subject to change.
Lecture | Date | Topic | Required Readings | Optional Readings | Announcements |
1 | Tue: Sept 5 | Introduction and Logistics | Armbrust2010, Vaquero11 | Rackspace2012, Shafii2012, DeanSOSP2015, Cano2016, Vieira2012, Vogels2016, reiss2012, Ferguson2012, Rajagopalan2013, Das2013 | Assignment 1: student info |
2 | Thu: Sept 7 | Introduction to Cloud Computing | // | // | |
3 | Tue: Sept 12 | Virtualization I | VMware2007, Xen2005 | VMware2006, VMware2007, Xen2003, Xen2005 | |
4 | Thu: Sept 14 | Virtualization II | // | // | Proposal literature review due: Sept 26th; Coding Assignment 1 due: Sept 30th |
5 | Tue: Sept 19 | Containers I | Namespaces in operation, CGroups documentation | VM lighter than container, Slacker | |
6 | Thu: Sept 21 | Containers II | Docker docs, Docker architecture | Improving Docker registry | Homework assignment 2: due date Oct 5th |
 | Tue: Sept 26 | Project Ideas Discussion | | | Proposal and literature review due Today |
7 | Thu: Sept 28 | Cloud Storage | GFS Paper, HDFS Paper | Hadoop Architecture Guide | |
8 | Tue: Oct 3 | Consistency Models | Visual Guide to NoSQL Systems, Delta Store (Meta), BespoKV paper | Chain Replication, CRAQ | |
9 | Thu: Oct 5 | Key-Value Stores | Scaling Memcached at Facebook, DynamoDB | // | |
10 | Tue: Oct 10 | Programming Model I | MapReduce, Spark | Hadoop vs Spark, MapReduce Architecture, Replex | |
11 | Thu: Oct 12 | Programming Model II | Pocket paper | A Berkeley View on Serverless Computing, A Berkeley View of Cloud Computing, The Wukong paper | Mid Project discussions and presentations start next week. |
 | Tue: Oct 17 | Patient API, Incident Migration, AutoScale KV store | Project Discussions | | |
 | Thu: Oct 19 | Harmful Post Detection, Distributed Fuzzy Inference | Project Discussions | | |
 | Tue: Oct 24 | Incentivized Collaboration, Distributed Movie Recommendation, Model Splitting for Distributed ML | Project Discussions | | |
12 | Thu: Oct 26 | Serverless Caching | InfiniCache, InfiniStore | Pocket, AWS Lambda | Coding Assignment 2: due Nov 5th |
13 | Tue: Oct 31 | Serverless Storage + Parallel Processing | InfiniStore, Wukong | Hacker News, Review | |
14 | Thu: Nov 2 | Serverless Parallel Processing + Cloud Control Operations | Wukong, CNSBench | Wukong Docs | |
15 | Tue: Nov 7 | MapReduce Scheduling | SPARK, hatS | MapReduce Heterogeneity | |
16 | Thu: Nov 9 | Coding Assignment 2 design discussion + Demo + Cloud Resource Management I | Mesos, MOS | Mesos Video | |
17 | Tue: Nov 14 | Cloud Resource Management II | Google Borg, Borg: the next generation, Alibaba Trace Analysis, Borg to Kubernetes | Alibaba Microservice Trace Analysis, Borg Video | Coding Assignment 3: due Nov 20th |
18 | Thu: Nov 16 | Deep Learning in Cloud | Horovod, Ray | Why Ray?, Spark Summit, Parameter Server | |
19 | Tue: Nov 21 | Deep Learning in Cloud II / Federated Learning as a Service | Alpa, IBM FL, TiFL | AlpaServe | |
 | Thu: Nov 23 | Thanksgiving Break | | | |
20 | Tue: Nov 28 | Communication in FL | FedAT | OORT | |
 | Thu: Nov 30 | Patient API, Incident Migration, Auto Scale KV Store | Project Presentations | | Homework Assignment 3: due Today |
 | Tue: Dec 5 | Harmful Post Detection, Distributed Fuzzy Inference, Incentivized Learning | Project Presentations | | |
 | Thu: Dec 7 | Model Splitting for ML, Online Movie Rec. | Project Presentations | | |
 | Tue: Dec 12 | Final Report Discussion | | | |
Final Project
The research project can be done in groups of 2 to 3 students. It is graded on the following components, which together make up the 50% of the course grade allotted to research projects:
- Proposal and literature review (10%)
- Midterm presentation (10%)
- Final presentation (10%)
- Project Report (20%)
Project Proposal Report
Recommended structure for the Proposal Report is:
- Title + abstract.
- Related work: Include roughly 15 references. References can be papers or websites (e.g., GitHub repos, Internet articles, blog entries, etc.).
- Motivation + concrete problem statement (need not be formal, only concrete).
- A brief description of your expected approaches/system_design/experiment_design to solve the problem.
- Timeline/Expected milestones for achieving your goals.
Mid-way Project Presentation
You should present the current progress of your final project, including the design, a review of related work, and any results you have so far (to show motivation or evaluation).
Final Project Presentation
You should present the final project including design, related work, and final results to showcase your project.
Project Final Report
The final deliverable of this course is a conference-quality paper. Focus on the following Questions 0 through 5 (an extended version of Questions 0 through 3 in the Checkpoint Report requirement):
- Question 0: What is/are your main hypothesis/hypotheses in the project? This is a one-sentence summary of your paper. Position your paper w.r.t. other related work on the same or even similar problems (reuse relevant parts of your survey report here!). Show to the reader that your work is on a unique point in the design spectrum. Don’t overstate it, don’t understate it either. This statement is related to Question 5 below (yes, it’s a chicken and egg problem!)
- Question 1: What are your goals?
- Question 2: What are your techniques?
- Question 3: What do your techniques gain you?
- Question 4: What do your techniques lose you? That is, what are the tradeoffs?
- Question 5: What do your experiments tell you? (This is related to Questions 0, 3, and 4 above, and again, it’s a chicken-and-egg problem!) Show that your data supports your hypothesis, and show the gains, losses, and tradeoffs of using your techniques.
Also pay attention to:
- Importance of problem,
- Novelty of solution,
- Evaluation of solution,
- Clarity of Presentation,
- Nits (grammar, references, etc.).
It is common for the implementation or experimental methodology to be adapted in the course of your investigation. Since this is your final write-up, please refine, change, or expand it based on the feedback from your mid-way presentation so that it reflects your latest implementation and evaluation activities.
Leaderboard - KV performance
Group Members | Throughput 3 KV Stores | Average Latency | Ranking |
Tejasvi Bansal, Rishabh Agarwal | 7362 | 3.1 ms | |
Ammar Ahmed, Connor Howe | 8240 | 2.87 ms | |
Xinran Wang, Jiaxiang Tang, Minrui Tian | 7900 | 3.20 ms | |
Jon Meshesha, Andrew Owens | 4456 | 3.53 ms | |
Pat Johnson | 9572 | 3.1 ms | |
Leo Dong, Michael Andrev, Apekshik | 30332 | 1.43 ms | 2nd |
Sahil Raina, Jingxian Chai, Shunichi Sawamura | 25588 | 2.9 ms | 3rd |
Aahan Tyagi, Hemanth Kumar, Vishal Kancharla | 83865 | 3.2 ms | 1st |