Introducing the Maven Taxi Challenge

5 min read
By Enrique Ruiz
Oct 12, 2021

Are you looking for a real-world business case to test your analytics skills?

Need a dashboard project to showcase on your portfolio?

We have just what you need, including another chance to win a free all-access membership to Maven Analytics!

We’ve just added a HUGE new data set to the Data Playground, containing records of 28 million Green Taxi trips in New York City.

This data set is perfect for data prep, profiling, and QA, and requires advanced analytics and visualization skills to bring it to life. So pick your favorite BI tool (Excel, Power BI, Tableau, etc.) and show us your skills!

To go along with this release, we’re launching the Maven Taxi Challenge and giving away a free all-access membership to the winner.

We’ll share more about the challenge details below, but first let’s talk about the data…

About the dataset

  • This dataset contains 6 tables in csv format, along with a geospatial map in TopoJSON and Shapefile formats
  • The 4 Taxi Trips tables contain a total of 28 million Green Taxi trips in New York City from 2017 to 2020. Each record represents one trip, with fields containing details about the pick-up/drop-off times and locations, distances, fares, passengers, and more
  • The 454 Calendar table contains a fiscal calendar (2017-2020) used by the Taxi & Limousine Commission, with fields containing the date and fiscal year, quarter, month, and week
  • The Taxi Zones table contains information about 265 zone locations in New York City, including the location id, borough, and service zone
  • The Taxi Zones Map files contain a map of New York City with divisions for the 265 locations that can be used to create custom map visuals in Power BI (TopoJSON) or Tableau (Shapefile)

How to play the Maven Taxi Challenge

For the Maven Taxi Challenge, you’ll be playing the role of a new Data Analyst for the New York City Taxi & Limousine Commission. It's your first week on the job, and you just received the following email from the Lead Dispatcher:

Welcome to the team!

We’ve been collecting trip data for ~4 years now, but without a proper analyst we haven’t been able to put it to good use. That's where you come in!

The raw data has some issues, so we'll need to make the following adjustments and assumptions to clean and prep the data:

  • Let’s stick to trips that were NOT sent via “store and forward”
  • I’m only interested in street-hailed trips paid by card or cash, with a standard rate
  • We can remove any trips with dates before 2017 or after 2020, along with any trips with pick-ups or drop-offs in unknown zones
  • Let’s assume any trips with no recorded passengers had 1 passenger
  • If a pickup date/time is AFTER the drop-off date/time, let’s swap them
  • We can remove trips lasting longer than a day, and any trips which show both a distance and fare amount of 0
  • If you notice any records where the fare, taxes, and surcharges are ALL negative, please make them positive
  • For any trips that have a fare amount but have a trip distance of 0, calculate the distance this way: (Fare amount - 2.5) / 2.5
  • For any trips that have a trip distance but have a fare amount of 0, calculate the fare amount this way: 2.5 + (trip distance x 2.5)

Once the data is cleaned up, I’m hoping you can build me a dashboard to help with weekly planning and logistics. For any given fiscal week, I'd like to be able to use historical data to answer the following questions:

  • What's the average number of trips we can expect this week?
  • What's the average fare per trip we expect to collect?
  • What's the average distance traveled per trip?
  • How do we expect trip volume to change, relative to last week?
  • Which days of the week and times of the day will be busiest?
  • What will likely be the most popular pick-up and drop-off locations?

I realize this is a lot to ask for, but this type of analysis will have a huge impact on our business!

Thanks in advance,

Mario Maven (Lead Dispatcher, NYC Green Taxis)
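
Mario's cleaning rules can be sketched as a single Python function applied to one trip at a time. This is only one possible reading of the brief, not an official solution. The field names (store_and_fwd_flag, trip_type, RatecodeID, payment_type, PULocationID, DOLocationID, pickup, dropoff and the charge fields), the numeric codes, and the use of IDs 264 and 265 as the unknown zones are all assumptions; check them against the files you download and the Taxi Zones table. The order in which the rules are applied is also a judgment call, since the email does not specify one.

```python
from datetime import datetime, timedelta

# Assumed codes and field names; verify them against the downloaded tables.
STANDARD_RATE = 1
CARD, CASH = 1, 2
STREET_HAIL = 1
UNKNOWN_ZONES = {264, 265}
CHARGE_FIELDS = ("fare_amount", "mta_tax", "improvement_surcharge")
BASE = 2.5      # flat amount in Mario's formulas
PER_UNIT = 2.5  # amount per unit of distance in Mario's formulas


def clean_trip(trip):
    """Return a cleaned copy of one trip dict, or None if the trip is dropped."""
    t = dict(trip)

    # Keep only non-store-and-forward, street-hailed, standard-rate, card/cash trips.
    if t["store_and_fwd_flag"] != "N":
        return None
    if t["trip_type"] != STREET_HAIL or t["RatecodeID"] != STANDARD_RATE:
        return None
    if t["payment_type"] not in (CARD, CASH):
        return None
    if t["PULocationID"] in UNKNOWN_ZONES or t["DOLocationID"] in UNKNOWN_ZONES:
        return None

    # Swap reversed timestamps first, then apply the date and duration filters.
    if t["pickup"] > t["dropoff"]:
        t["pickup"], t["dropoff"] = t["dropoff"], t["pickup"]
    if t["pickup"].year < 2017 or t["dropoff"].year > 2020:
        return None
    if t["dropoff"] - t["pickup"] > timedelta(days=1):
        return None

    # Missing or zero passenger counts are assumed to be 1.
    if not t.get("passenger_count"):
        t["passenger_count"] = 1

    # Flip the sign only when every charge field is negative.
    if all(t[f] < 0 for f in CHARGE_FIELDS):
        for f in CHARGE_FIELDS:
            t[f] = -t[f]

    # Drop trips with no distance and no fare; otherwise fill in the missing one.
    if t["trip_distance"] == 0 and t["fare_amount"] == 0:
        return None
    if t["trip_distance"] == 0:
        t["trip_distance"] = (t["fare_amount"] - BASE) / PER_UNIT
    elif t["fare_amount"] == 0:
        t["fare_amount"] = BASE + t["trip_distance"] * PER_UNIT
    return t


# Invented sample record: reversed timestamps, no passengers, no distance.
sample = {
    "store_and_fwd_flag": "N", "trip_type": 1, "RatecodeID": 1, "payment_type": 2,
    "PULocationID": 74, "DOLocationID": 41,
    "pickup": datetime(2019, 3, 4, 8, 30), "dropoff": datetime(2019, 3, 4, 8, 10),
    "passenger_count": 0, "trip_distance": 0.0,
    "fare_amount": 10.0, "mta_tax": 0.5, "improvement_surcharge": 0.3,
}
cleaned = clean_trip(sample)
print(cleaned["pickup"], cleaned["passenger_count"], cleaned["trip_distance"])
```

For the sample record, the timestamps are swapped, the passenger count becomes 1, and the distance is filled in as (10 - 2.5) / 2.5 = 3. Note that a fare below 2.5 combined with a distance of 0 yields a negative distance under the formula; the email does not say how to treat that case, so you will need to decide and document your choice.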

For this challenge, your task is to build a dashboard that meets Mario's requirements, and share a single-page screenshot for any given fiscal week.
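
Before building visuals, it can help to prototype the numbers behind Mario's first four questions. The sketch below is a hypothetical Python helper, not part of the challenge materials: it assumes trips have already been cleaned and reduced to (pickup date, fare, distance) tuples, and that the Calendar table has been loaded into a dict mapping each date to a (fiscal year, fiscal week) pair. The toy data is invented and does not reflect the real fiscal calendar.

```python
from collections import defaultdict
from datetime import date


def weekly_summary(trips, calendar, fiscal_week):
    """Summarise historical trips for one fiscal week number across all years.

    trips: iterable of (pickup_date, fare_amount, trip_distance) tuples
    calendar: dict mapping a date to a (fiscal_year, fiscal_week) tuple
    """
    counts = defaultdict(int)  # (fiscal_year, fiscal_week) -> number of trips
    fare_total = dist_total = 0.0
    n = 0
    for pickup_date, fare, dist in trips:
        year, week = calendar[pickup_date]
        counts[(year, week)] += 1
        if week == fiscal_week:
            fare_total += fare
            dist_total += dist
            n += 1

    def avg_trips(week):
        # Average the yearly trip counts recorded for this week number.
        per_year = [c for (_, w), c in counts.items() if w == week]
        return sum(per_year) / len(per_year) if per_year else 0.0

    this_week = avg_trips(fiscal_week)
    last_week = avg_trips(fiscal_week - 1)
    return {
        "avg_trips": this_week,
        "avg_fare": fare_total / n if n else 0.0,
        "avg_distance": dist_total / n if n else 0.0,
        "change_vs_last_week": (this_week - last_week) / last_week if last_week else None,
    }


# Invented toy calendar and trips, for illustration only.
calendar = {
    date(2018, 1, 1): (2018, 1), date(2018, 1, 8): (2018, 2),
    date(2019, 1, 1): (2019, 1), date(2019, 1, 8): (2019, 2),
}
trips = [
    (date(2018, 1, 1), 10.0, 2.0), (date(2018, 1, 1), 12.0, 3.0),
    (date(2018, 1, 8), 8.0, 1.0), (date(2018, 1, 8), 12.0, 3.0),
    (date(2018, 1, 8), 10.0, 2.0),
    (date(2019, 1, 1), 9.0, 2.0), (date(2019, 1, 1), 11.0, 2.0),
    (date(2019, 1, 8), 10.0, 2.0), (date(2019, 1, 8), 14.0, 3.0),
    (date(2019, 1, 8), 6.0, 1.0),
]
summary = weekly_summary(trips, calendar, fiscal_week=2)
print(summary)
```

With the toy data, fiscal week 2 averages 3 trips per year against 2 for week 1, so the week-over-week change is 0.5 (a 50% increase). The sketch does not handle week 1, whose previous week is the last week of the prior fiscal year, and it leaves the busiest days, times, and locations to your BI tool of choice.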

Here’s how to submit your entry:

  1. Share a LinkedIn post mentioning @Maven Analytics and the hashtag #maventaxichallenge, with your single-page dashboard based on the challenge objective above
  2. Complete the official challenge submission form to make sure you are entered for a chance to win

Make sure to follow Maven Analytics on LinkedIn for updates on the challenge and invite your connections to play along!

How to win

Finalists will be chosen by the Maven team based on the number of dashboard requirements met, the accuracy of the final dashboard, and the overall dashboard design and visualizations. The winner will be selected from the finalist pool by the Maven team, via live voting.

What you can win

As part of the Maven Taxi Challenge, we'll be giving away 1 annual all-access Maven membership completely free of charge. The winner will get access to all of our Excel, Power BI, Tableau, and SQL courses, as well as our ongoing Machine Learning series, all with direct support and coaching from our expert instructors.

No purchase is necessary to participate. As always, this challenge is open to both current Maven subscribers and everyone else.

Check out the official rules below...

Official Terms (Maven Taxi Challenge)

  • Maven Analytics will give away 1 Annual subscription for an Individual account. After the one-year period, the subscription will expire.
  • No purchase is necessary to enter.
  • The Data Analyst Bootcamp is a separate offering and is not included in this prize.
  • The challenge will close to new entries on Sunday, November 7th, 2021, at 11:59pm Eastern Standard Time. All entries must be submitted before then.
  • After the submission deadline, finalists will be chosen by the Maven team.
  • Maven Analytics will select 1 winner from the finalists, and will announce the winner on Tuesday, November 16th, 2021 at 10:00am Eastern Standard Time. The official announcement will be posted on LinkedIn from the Maven Analytics LinkedIn account.

Don't want to participate in the Maven Taxi Challenge? No problem. You can still analyze the data on your own. All of the Data Playground datasets are completely free and are available for everyone to learn with. Feel free to dig in anytime.

Check out the NYC Taxi Trips dataset, and many more, at the Data Playground.

Happy analyzing!

Author

Enrique Ruiz

Enrique is a certified Microsoft Excel Expert and top-rated instructor with a background in business intelligence, data analysis and visualization. He has been producing advanced Excel and test prep courses since 2016, along with adaptations tailored to Spanish-speaking learners.
