Skip to content

aquanauts/streaming_interview

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Aquatic's Streaming Python Test

Thank you for taking the time to apply for a position at Aquatic Capital Management. Please solve the problem described below, and submit your solution. We expect that solving it will take you an hour or two. Here are some criteria we'll use to evaluate your solution:

  • Is the program correct? Does it do everything we specified in the problem statement?
  • Did you add any unnecessary complexity?
  • Are your changes to the code well tested? Do your tests pass?
  • Can you maintain good project hygiene, treating this like "real world" software that the engineers at Aquatic would want to maintain in perpetuity?
  • Is the program performant? Does it avoid excessive memory usage?

If your solution meets all of these criteria, we may follow up with a second interview. In this second interview, we'll pair you up with another Aquatic engineer to review your solution and make further changes to it.

The Problem

Included in this workspace is a Python program designed to aggregate a stream of samples from weather stations on Chicago city beaches into messages that provide snapshots of the aggregated state of the weather. You should assume this program will run as part of a larger system that delivers messages in JSON format, one object per line, to the STDIN of the program. The program should output JSON, one line per object, to STDOUT. The reading and writing functions have already been implemented for you.

Your goal in this interview is to change the program to correctly process these messages, and produce snapshots of the state when requested. The incoming messages types are heterogeneous, and can include both samples from weather stations, and control messages.

Sample Messages

Weather samples have the following keys:

  • type - The message type, as a string. "sample" for weather samples
  • stationName - A human-readable string identifying the weather station
  • timestamp - A UTC millisecond precision timestamp representing when the sample was taken, as an integer number. The sample below represents Jan 1, 2023 12:00:00 AM GMT
  • temperature - The floating point Fahrenheit temperature
Example message
{
  "type": "sample",
  "stationName": "Foster Weather Station",
  "timestamp": 1672531200000,
  "temperature": 37.1
}

Control Messages

In addition to weather sample messages, there are also control messages. These messages instruct your program to print out various statistics about the data ingested so far. The output must be valid JSON, one line per object, with a single newline character representing the end of line.

Snapshot

The primary control message is snapshot. The snapshot message is a request for your program to output the high and low temperatures observed at each weather station.

Example:

{
  "type": "control", 
  "command": "snapshot"
}
Output Format

When your program receives a snapshot control message, it should output a JSON object with the following fields:

  • type - The output type ("snapshot" in this instance)
  • asOf - The most recent weather sample timestamp received at the point when the snapshot or aggregation was taken. Data with a timestamp later than this time should not be included in the output. All data with a timestamp equal to or before this timestamp must be included.
  • stations - A object that uses station names as keys, with object values that contain high and low temperature values

Example:

{
  "type": "snapshot",
  "asOf": 1672531200000,
  "stations": {
    "Foster Weather Station": {"high": 37.1, "low": 32.5}
  }
}

Reset

When your program receives a reset control message, it should drop data associated with all weather stations.

Example:

{
  "type": "control", 
  "command": "reset"
}

In response to this, your program should output a message confirming that the data has been reset. This message should include the following fields:

  • type - The output type (reset in this example)
  • asOf - The most recent weather sample timestamp received at the point when the reset occurred. Data received at or before this timestamp should not be included in any subsequent snapshot responses.

Example:

{
  "type": "reset",
  "asOf": 1672531200000
}

Important Details

  • Do not change the signature of the process_events function in the weather module. This is used to grade your solution.
  • If the program encounters an unknown message type, it should raise an informative exception
  • process_events must remain a generator which lazily evaluates the input. Do not hold all the input data in memory.
  • You should handle messages in any order. Control messages received when no sample data is present (at program start or after a reset) should be ignored.
  • Before submitting your solution, be sure to that all tests and linters pass by running make test. You can use make watch to run this process continuously as you work.

About

Our Streaming Python Interview Question

Resources

License

Code of conduct

Stars

Watchers

Forks

Releases

No releases published

Packages