CS 4984 - Data Science & Analytics Capstone
Spring 2019

Projects

Term Project (75%) - Group

The goal of the final project is to identify an interesting question or problem on online social media platforms that you can address by analyzing online data. The papers and practicums discussed in class should help you along the way. All project topics must come from one of the following two themes:

  • Theme 1: News and misinformation
  • Theme 2: Hateful and offensive content

Project topics must be approved by the instructor. You need to justify that the key question your project addresses is relevant to the course, relevant to the theme, and of suitable difficulty. Your project should involve some non-trivial analysis, algorithms, computation, or experimentation (e.g., computing basic statistics such as average, min, and max will not be enough). A list of high-level suggested topics for projects will be made available. While you are free to implement the project in the programming language of your choice, I highly recommend using Python for analysis-style projects. You will use Jupyter notebooks to present your data analysis and for any in-class project discussions. The term project has multiple graded components, starting with the project proposal and ending with the final presentation and report submission.

Once you have selected a topic, you should do some background reading so that you are capable of describing, in some detail, what you expect to accomplish. For example, if you decide that you want to implement some new proposal for detecting hateful content on the social network Gab, you will have to carefully read papers that address this problem, pinpoint their weaknesses or come up with new suggestions based on what you read, and explain how your approach will address these weaknesses or is a good alternative. Once you have read up on your topic, you will be ready to write your proposal.

  • Teaming: The work will be carried out in teams of 3 to 4 people. To help with team formation and to identify people with similar interests and complementary skills, you will introduce yourself to your classmates by writing a short intro about yourself. Please do this in the first week of class, so that you have enough time to form teams. Your intro should include:
    • Your name
    • Major and year
    • Project interests
    • Relevant skills
    • Anything else that you think is relevant for your team formation.
  • Weekly Updates: Each group should write up a weekly update to share with the rest of the class and the instructor. Weekly updates should answer the following three questions.
    • What did you do this week?
    • What are you planning to do next week?
    • What problems are you facing?

    For each question, list your answers as numbered bullet points. Summarizing your progress this way should take at most 5 to 10 minutes. Better yet, you can do this in class while your group is working on the project. If there was no progress in a particular week, say so, and state why. Note the catch: explaining why there was no progress takes more time than reporting progress. So why not make consistent progress every week and spend minimal time on weekly updates?

    These will be graded for completeness and will count towards your milestone grade. The aim of the updates is to ensure that the group stays on track and that any roadblocks can be detected and solved at an early stage. It is also an opportunity for your classmates to provide solutions to the last question – what problems are you facing?

Project Proposal (5%) - Individual & Group (in phases)

A successful project goes through the classic three steps: idea generation, idea selection, and implementation. In choosing your final project, you will go through this multi-step process. Several of these steps will be conducted in class with your peers. Below are the steps that you will need to follow:

  • Proposal Idea Generation (Individual) (submit on Canvas – graded for completeness) – Again, note the word: completeness. You cannot do a sloppy job and expect a grade. For your idea generation phase, you should propose a minimum of two ideas for your project. You are free to propose more, but your submission should be limited to 2 pages in 12-point font. Each of your ideas should clearly answer three questions and mention the topic of your idea in a few short keywords:
    • What problem in a social computing system are you trying to study and/or address? (max 1 or 2 sentences)
    • What do you want to do? (A bit more detail without any technical jargon. Focus on the problem, not on the method at this stage).
    • Why should we care?
    • Keywords – To mark topic and domain of the idea.

Your proposed ideas will be reviewed by your peers during an in-class lab activity. Please print copies of your proposed ideas for the feedback day. I will try my best to match peers based on the keywords that you mention. This is your chance to get supportive, helpful feedback on early-stage ideas. Be open to candid discussion of the weaknesses of your proposed ideas. This is also your chance to form teams for your group projects and identify students with similar interests.

  • Feedback on Ideas (Individual) (in-class activity – graded for completeness) – Providing feedback is a crucial activity that you will experience throughout your career. This is your chance to practice providing helpful, constructive feedback. VERY IMPORTANT NOTE: A key thing to remember while giving feedback is that your feedback is not meant to make you look smart or to show off in any way. Your feedback should be helpful, constructive, and respectful towards your peers.

  • Final Proposal (Group) (submit on Canvas – graded as per grading rubric) – Based on the feedback received and the discussions that you have had, you will form your project team and submit your final project proposal. Your final proposal can be based on one of your earlier proposed ideas, be a mashup of your and your team members’ ideas, or be different from your initially proposed ideas. Consult with me before finalizing. Once you have selected a topic and finalized your project team, you should do some background reading so that you are capable of describing, in some detail, what you expect to accomplish. Once you have read up on your topic, you will be ready to write your proposal.

  • Guidelines for Project Proposal: Your project proposal should include answers to all of Heilmeier’s questions, as well as the name of your project and team members. Your proposal should be fewer than 1000 words, excluding titles, section names, reference list, etc., but including the literature survey. It should use 12pt font and be submitted as a PDF (created using any software, e.g., LaTeX or Word), with figures, tables, etc. wherever useful. So, your proposal should include the following:
    • The name of your team. Something memorable which resonates with what you are trying to do.
    • The name of all team members.

    All 9 Heilmeier questions (Wikipedia source):

    1. What are you trying to do? Articulate your objectives using absolutely no jargon. What is the problem? Why is it hard?
    2. How is it done today, and what are the limits of current practice?
    3. What’s new in your approach and why do you think it will be successful?
    4. Who cares?
    5. If you’re successful, what difference will it make? What impact will success have? How will it be measured?
    6. What are the risks and the payoffs?
    7. How much will it cost (in terms of people, resources, etc.)?
    8. How long will it take?
    9. What are the midterm and final “exams” to check for success? How will progress be measured?

Project Pitch (5%) - Group

Each team should present a pitch to the class stating the problem they are working on. It should focus on the why, what and how questions - Why is it needed? Who wants or benefits? What are you planning to do? How will you do it?  What skills and tools are needed? You can also follow the questions that you answer in your proposal to prepare your pitch presentation.

Project Midterm Checkpoint (5% midterm presentation + 10% midterm report) - Group

Midway through the project, each team will be required to make a midterm presentation and submit a midterm report outlining their work. We will also hold additional checkpoints over the course of the project. This will be your chance to show your progress or discuss any lingering doubts or questions.

[Practicum milestone checkpoints] (10%) - Group

There will be two milestone checkpoints for your project.

  • First milestone check happens after your project proposal and before your midterm presentation.
  • Second milestone check happens after your midterm report and before your final project presentation.

10% of your project’s grade involves showing significant progress at these two milestone checkpoints. Treat the milestone checkpoints as opportunities to share a compelling aspect of the data analytics capstone project that you are currently working on. Your practicum must include a technical component, which should be presented via Jupyter notebook. The component should provide “key insights” detailing one or more steps in the data science workflow of your project. For example, it can be a feature or library that you are currently using for acquiring data for your project (data preparation phase), a brief exploration of your dataset to show patterns that you are observing (data exploration phase), or a predictive model demonstrating what you found in your data (analysis and modeling phase).
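
As a rough illustration of what a small milestone notebook cell might look like, the sketch below walks through the three phases named above on a handful of made-up posts. Everything in it is an assumption for illustration only: the `posts` records, the `misleading`/`reliable` labels, and the helper names `tokenize`, `token_counts_by_label`, and `predict` are hypothetical, and the word-overlap baseline is just a placeholder for whatever model your project actually uses.

```python
import re
from collections import Counter

# Data preparation phase: hypothetical toy records standing in for the
# posts your project would actually collect and label.
posts = [
    {"text": "Breaking: miracle cure found, share now!", "label": "misleading"},
    {"text": "City council approves new budget for 2019", "label": "reliable"},
    {"text": "Share now before they delete this miracle story", "label": "misleading"},
    {"text": "Local team wins the regional championship", "label": "reliable"},
]


def tokenize(text):
    """Lowercase a post and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def token_counts_by_label(records):
    """Data exploration phase: count how often each token appears per label."""
    counts = {}
    for record in records:
        label_counter = counts.setdefault(record["label"], Counter())
        label_counter.update(tokenize(record["text"]))
    return counts


def predict(text, counts):
    """Analysis and modeling phase: a naive baseline that picks the label
    whose training tokens overlap most with the tokens of the new post."""
    tokens = tokenize(text)
    scores = {
        label: sum(counter[token] for token in tokens)
        for label, counter in counts.items()
    }
    return max(scores, key=scores.get)


counts = token_counts_by_label(posts)

# Show the most frequent tokens per label, then classify an unseen post.
for label, counter in counts.items():
    print(label, counter.most_common(3))
print(predict("miracle cure, share now", counts))
```

A real checkpoint would replace the toy records with your own dataset and go well beyond frequency counts, since basic statistics alone do not meet the project's bar for non-trivial analysis.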

The goals are to help groups stay on track, to allow rapid knowledge and material sharing among students, to provide frequent opportunities for outside feedback, and to help groups reflect on their own processes and progress.

Final Project Presentation (10%) - Group

  • Final Project Presentation: Each group will present their project in a final presentation towards the end of the semester, summarizing their key findings. Make your presentation engaging.

Final Project Poster (5%) - Group

  • Poster presentation at VTURCS Symposium: This is your chance to show off your excellent projects at the VTURCS Spring Symposium and win prizes. Check the VTURCS website for specific submission guidelines. For the Canvas submission, you only need to submit a PDF copy of your poster.

Final Project Report (15%) - Group

  • Final Project Report: Your final project deliverable is a project report clearly articulating the motivation, analysis and key takeaways. You also need to submit your codebase.
  • Peer Feedback: After submitting your final report, you will evaluate the work of your team members. The comments and ratings provided by your team members are key to determining your final grade. So be nice to your team members, choose them wisely, and diligently do your share of the work.