The purpose of this course is for students to learn how to engage in the scientific process using data-centric concepts and methods and to think like a data scientist by critically analyzing their own work and the work of others.
It is my goal that after completing this course successfully, you will be able to:
Explore a data set to determine whether and how it might illuminate questions of interest.
Define and operationalize a research question such that a data analysis could produce meaningful knowledge.
Use best practices to carry out analyses in a documented, reproducible, and efficient fashion.
Present the results of a data analysis with appropriate visuals and written argument.
Identify weaknesses in a data analysis and assess their impact on the correctness and utility of the results.
Assess ethical implications of an analysis in terms of both classical human subject research ethics and contemporary concerns such as fairness and bias.
Understand the space of data science techniques and applications, and relate future learning to this framework.
- Course Title
CS 533: Introduction to Data Science
- In-Person Schedule
Tuesdays and Thursdays 9:00–10:15 AM in CCP 221
- Course Website
Canvas (private links, grades, and assignments)
- Course Discussion
I am Michael Ekstrand, an assistant professor in the Dept. of Computer Science.
firstname.lastname@example.org (but please use Piazza for non-grade class questions)
- Office Hours
Tue 10:30–11:30 AM
Fri 10–11 AM
Online by appointment
I generally respond to course questions during normal hours (9a–5p M–F). I may occasionally reply to a question in an evening or on the weekend, but do not plan on it.
COVID-19 syllabus notice
This section is a required syllabus notice provided by Boise State University. For information about the plans and accommodations for COVID that I have designed into this specific class, see Coronavirus.
Many Boise State classes have resumed face-to-face meetings in the midst of a global pandemic and a recent local surge of infections. Our goal is to have a successful academic year while keeping our students, faculty, and local community healthy and safe. Public health requirements are in place to achieve that goal, the primary mechanism for which includes the mandatory use of facial coverings that protect all of us.
We have taken health precautions on campus so that you can have the option of a face-to-face course. However, there is still inherent risk associated with face-to-face courses during a pandemic because of proximity to others and length of potential exposure to the virus. Therefore, as members of this learning community, it is imperative that we all engage in behaviors that protect the overall public health.
You have enrolled in a face-to-face course, and this format offers a number of benefits that appeal to many students. In order to preserve your access to this face-to-face option you are required to
sit in the same seat all semester (for purposes of contact tracing) and
wear facial coverings in all face-to-face learning environments. You must keep your mouth and nose covered at all times throughout class — facial coverings cannot be pulled up or down. As a health precaution, eating and drinking are NOT permitted in the classroom.
By enrolling in an in-person course, you agree to comply with Boise State’s rules and precautions which include, but are not limited to, facial coverings, frequent hand washing, hand sanitizing, and sitting in the same seat all semester. Failing to comply with these rules and precautions is a violation of Boise State’s Student Code of Conduct and will subject you to university sanctions and discipline.
Seating assignment will be based on learning teams, and will be effective the second day of class after learning teams are formed.
University policy states that I am not allowed to begin/continue with instruction unless and until everyone present has a facial covering in place.
This course is designed to be accessible to all students. A very small percentage of people cannot wear facial coverings for reasons related to medical conditions or disabilities. If this is your experience, please contact the Educational Access Center to document your condition so that we may determine the best accommodation for you. Until an accommodation is in place, you will need to participate remotely. If you need to read lips or facial expressions to understand what people are saying, please let the Educational Access Center and me know via email.
If you are unwilling to wear a facial covering, you cannot participate in person. If this is the case, please dismiss yourself and either inquire whether you may participate in the class fully remotely, or contact the Registrar’s Office (208-426-4249) to pursue your learning experience in a different remote or online section. Should you refuse to cover your mouth and nose and also refuse to leave the classroom, I have been directed to dismiss the class and you will be reported to and contacted by the Dean of Students Office.
Mutual Guidelines for Safe Learning Environments
While these public health measures are essential to protecting our individual and communal health, they also complicate how we engage in teaching and learning. The following guidelines should ease our comfort and communication with one another:
In the classroom, we must wear a facial covering that covers our mouth and nose at all times. If you or I let our facial coverings slip, we will politely remind one another to secure our masks.
Facial coverings muffle voices. I will use the classroom microphone to amplify my voice through my mask. In addition, I will repeat your questions and summarize comments to ensure we all can follow any discussion.
Resources and Readings
The Resources page on the course web site has a more complete list of resources that I will update throughout the semester.
Our primary textbooks are:
Python for Data Analysis (2nd Edition) by Wes McKinney (O’Reilly, ISBN 978-1491957660)
Think Like a Data Scientist by Brian Godsey (Manning, ISBN 978-1633430273)
If you want a more thorough treatment of the core Python language in a traditional book format, I recommend:
Learn Python the Hard Way by Zed Shaw
Throughout the semester, I will assign various readings from the Internet and research papers. These will be posted to the course web site.
We will be using Python with the PyData tools (Pandas, NumPy, SciPy, matplotlib, Seaborn, etc.). The easiest way to install the required software is to install Anaconda Python. The various Python libraries we use each have their own documentation.
I will not provide support for debugging Python installations other than Anaconda (and other Conda distributions, like miniconda and miniforge).
Further information about software, and links to documentation, can be found in the course resources.
This class uses a flipped-classroom design. I will primarily not be lecturing in class; instead, content delivery is asynchronous through the following resources:
Accompanying notes (on the course web site)
Other readings linked from the lecture notes and course web page
Each week has approximately 75–90 minutes of video material, plus some reading. There will be a short quiz before each Thursday’s class as an initial check on your understanding of the material. Weeks with exams will have lower video and reading loads.
Our in-person class time will be for discussing the course material and topics, additional mini-lectures to supplement your understanding of the course, and team-based exercises to practice the material with ready access to peer and instructor support.
Putting these together, along with the larger assignments, results in the following components of the class:
Reading and study
Work in class
Your final grade will be computed from the course components as follows:
The standard 70/80/90 scale determines the minimum grade you will receive (that is, if you have 80 total course points, you will receive at least a B-).
I expect you to actively participate in our class sessions. Since these are interactive working sessions, if you have a laptop please bring it to class.
The in-person class sessions are based on the principles of team-based learning, adapted to this class’s needs and role in the graduate curriculum. You will be working with a group of your peers throughout the semester helping each other practice the material, discuss and apply it to small exercises and examples, and identify places where you need and want to learn more.
On Thursdays, our class will look like a “normal” team-based learning class — a team quiz (the Readiness Assessment Process), supplementary content discussion as needed, and an application exercise. The TBL readiness assessment process normally consists of two parts: an individual quiz, followed by re-taking the quiz as a team to discuss and improve your answers. In this class, we will be implementing that with an online individual quiz due before class and an in-person team quiz (see Quizzes).
Tuesdays will be more varied. The first Tuesday will be the class introduction and overview, and our exams will be on Tuesdays. On weeks when an assignment is due (see the schedule), we will use the class period for collaborative problem-solving about the assignment. Other Tuesdays will be for more extended discussion and application exercises depending on where we are in the semester.
In-class work will contribute to your grade.
One final note on classes: taking care of our health, individually and as a class, is top priority. While I aim for every class to be a meaningful, can’t-miss learning experience, I also want us to have a general expectation that if we’re ill, we stay home, both to recover ourselves and to protect our colleagues’ health. I’ve designed the grading policies to help with this (see Coronavirus), but if we need to further adjust to accommodate the semester’s health demands, we will. If you need to miss class, I encourage you to phone in and have a teammate put you on speakerphone during your team’s activities, if you are feeling well enough; this will allow you to contribute to the team quiz and work.
There is a short weekly individual quiz in Canvas, due before class on Thursday (at 8 AM, so I can look at results before class), on the readings and videos. The purpose of these quizzes is to help make sure you are prepared for applying the material, and to give both you and me early and frequent checks on your understanding.
In class on Thursdays, we will take the team quiz, which will usually be the same as or very similar to the individual quiz. This is an opportunity to refine your understanding of the material and collaboratively fill in gaps you may have missed. Individual and team quizzes are weighted equally in your final grade.
For both individual and team quizzes, only the 10 highest scores will contribute to your grade.
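As a rough illustration of the 10-highest rule, here is a minimal Python sketch. The function name and the example scores are hypothetical, and this is not the actual grading code for the course; it only shows which scores would count.

```python
def counted_quiz_scores(scores, keep=10):
    """Return the `keep` highest quiz scores, highest first.

    Hypothetical helper: if fewer than `keep` quizzes were taken,
    every score is returned.
    """
    return sorted(scores, reverse=True)[:keep]


# Made-up scores for 12 weekly quizzes (out of 10 points each).
example_scores = [5, 9, 7, 10, 8, 6, 10, 9, 4, 8, 7, 3]
counted = counted_quiz_scores(example_scores)
print(counted)  # the two lowest scores (4 and 3) are left out
```

The same function would be applied separately to the individual quiz scores and to the team quiz scores.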
There will be 8 homework assignments practicing data science techniques in Python. Each assignment is due at midnight on Sunday of the week in which it is due.
I have scheduled this due date to give you the weekend to finish the assignment if that works best with your schedule. However, as documented above, I do not commit to checking Piazza on the weekends because I need to protect my own time. You should therefore start each assignment early enough that you can raise questions and have them resolved before the weekend is over.
The first assignment (A0) is a warm-up assignment to make sure that you can install the software and run Python notebooks. You must complete this assignment individually.
The other assignments (A1–7) are full assignments doing data science with Python. You may do up to 3 of these assignments with a partner, and must complete 4 individually. You may choose which assignments you solo and which are a group effort. When doing an assignment with a partner, submit one copy for both of you, and indicate your partner’s name in the Canvas submission comments.
I will drop the lowest assignment grade.
Exams and Final
There will be two midterm exams and a final.
There will be a makeup exam available the last week of class. If you turn in the makeup exam, your grade on it will replace your lowest normal exam grade.
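The makeup rule can be sketched in Python as follows. This is an illustration under a literal reading of the sentence above (the makeup grade replaces the lowest exam grade whenever a makeup is turned in); the function name and scores are hypothetical, and which exams count as "normal" is left to the caller.

```python
def apply_makeup(exam_scores, makeup_score=None):
    """Return exam scores with the lowest one replaced by the makeup.

    Hypothetical helper. If no makeup was turned in (None), the
    scores are returned unchanged.
    """
    scores = list(exam_scores)  # copy so the input is not modified
    if makeup_score is None or not scores:
        return scores
    lowest_index = scores.index(min(scores))
    scores[lowest_index] = makeup_score
    return scores


# Made-up exam grades with a makeup grade of 78.
print(apply_makeup([82, 64, 75], 78))
```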
Our work within this structure is governed by the following policies, in addition to applicable university policies and regulations and general principles of academic and scholarly integrity.
Web Site and Announcements
I will use Piazza for all course communication, including announcements. Please make sure your Piazza notifications are set correctly, so that you are notified of important announcements.
I will sometimes need to update assignments after I have issued them. When this is necessary, I will include a revision log at the top of the assignment describing the changes, and will make a course announcement regarding the change. I will also state whether the revision changes a requirement (this is rare), or clarifies a requirement.
For the assignments, you have a budget of 4 late days to use throughout the semester, at your discretion. Each late day extends an assignment deadline by 24 hours with no penalty; late days are indivisible, so submitting an assignment 12 hours late requires an entire late day. You may use up to 3 late days on a single assignment. When submitting an assignment using a late day, state with your submission the number of days you are using. I appreciate it if you notify the TA and me (via a Piazza private message) prior to the deadline that you are planning to submit late, but do not require you to do so.
This policy, combined with dropping the lowest assignment grade, is designed to accommodate most ordinary needs for extensions or late submissions. Therefore, exceptions beyond this policy will not generally be granted; any requests for individual exceptions must be submitted in writing (by e-mail or Piazza) so that I have a record of the request and my response.
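The late-day arithmetic above can be sketched in Python. This is a minimal illustration, not an official calculator: the function names, constants, and the example deadline are hypothetical, and only the numbers (a budget of 4, at most 3 per assignment, indivisible 24-hour days) come from the policy.

```python
from datetime import datetime, timedelta

LATE_DAY_BUDGET = 4         # late days available over the semester
MAX_PER_ASSIGNMENT = 3      # most late days usable on one assignment


def late_days_needed(deadline, submitted):
    """Whole late days needed; any partial day counts as a full day."""
    if submitted <= deadline:
        return 0
    whole_days, remainder = divmod(submitted - deadline, timedelta(days=1))
    return whole_days + (1 if remainder > timedelta(0) else 0)


def within_policy(deadline, submitted, days_already_used):
    """Check a submission against the per-assignment and semester limits."""
    needed = late_days_needed(deadline, submitted)
    return (needed <= MAX_PER_ASSIGNMENT
            and days_already_used + needed <= LATE_DAY_BUDGET)


# Hypothetical deadline; a submission 12 hours late uses one late day.
deadline = datetime(2021, 9, 5, 23, 59)
print(late_days_needed(deadline, deadline + timedelta(hours=12)))
```

Rounding up with `divmod` on the time difference avoids floating-point division, so a submission exactly 24 hours late still costs one late day rather than two.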
Exams will be at the published times. The makeup exam is the ordinary accommodation for not being able to take the exam when scheduled.
Cheating and Academic Integrity
As both a scientist and a student, you are expected to do your own work, attribute sources, and respect the legal and moral rights of others with respect to their work; as a student, you are also required to abide by the Boise State University Student Code of Conduct. While I aim to allow you to make reasonable use of resources, cheating (including copying code, using unauthorized resources during tests, etc.) is not OK. If you are found to be cheating, the penalty may range from an F on the assignment to an F in the course, and the incident will generally be reported to the university.
I expect you to behave in a civil, respectful manner in all class interactions and to contribute to a constructive learning environment.
The Recurse Center Social Rules are a good source of guidance on how to maintain a constructive and educational environment.
If you experience or witness harassment of any form, please let me know.
If you need particular accommodations or support to be able to fully participate in this course, please talk with me as soon as possible by e-mail or in office hours. If you have documentation from Disability Services authorizing specific accommodations, please bring it; however, a documented disability is not necessary for me to be willing to talk with you about how to make the course work for you.