Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How do I return a simple count of instances where the query is true? #32

Closed
BlackArbsCEO opened this issue Jul 23, 2014 · 8 comments
Closed

Comments

@BlackArbsCEO
Copy link

For example how to return a count of the total number of 'plays' by a 'team' in a 'game' or 'drive'. In other words, count the number of items returned in the list.

I'm sure there is a simple solution but I have not been able to find one. Even though the nfldb objects are lists, the Python builtin functions I've tried like ['count'] have not worked.

Any help would be massively appreciated. Thank you.

@BurntSushi
Copy link
Owner

import nfldb

db = nfldb.connect()
q = nfldb.Query(db)

q.game(gsis_id='2013090800')
print len(q.as_plays())

@BlackArbsCEO
Copy link
Author

Thanks!

@BurntSushi
Copy link
Owner

@BlackArbsCEO Sorry for the curt response, but I was distracted with work.

There are two things I wanted to add. Firstly, in Python, the count method on lists returns the number of items equivalent to the value given. For example, ['a', 'b', 'c', 'a'].count('a') returns 2 whereas len(['a', 'b', 'c', 'a']) returns 4.

Secondly, if you care about performance, you would want the database to do the counting for you. When you use len(q.as_plays()), that has to pull all of the data from the database and load it into Python values. That is very expensive if you just want a count. If you needed better performance, you could use raw SQL to get a quicker count. Let me know if you want me to help you with that.

@BlackArbsCEO
Copy link
Author

That would be awesome and much appreciated! Specifically I'm looking to learn how to generate queries related to the following:

-a count of total plays run ( per off/def possession, per game[including special teams], or by a particular player in a situation )

-aggregated and unaggregated stats for situational plays ( ex. most rushing/passing attempts inside opponent redzone, most redzone targets etc. )

@BurntSushi
Copy link
Owner

@BlackArbsCEO I'll try to get to that tonight. But before I do, could you elaborate a bit more on your performance requirements? Dropping to SQL really isn't something you should do unless you know it needs to be absolutely as fast possible. (Like, if you're running a web site.)

If you're just doing some basic analysis, then you shouldn't have to touch SQL at all. Everything can be done through the query interface. Have you taken a look at the wiki yet?

@BlackArbsCEO
Copy link
Author

I have looked at the wiki but have yet to decipher some of the most effective ways to request the aforementioned queries. I have spent ~5 days getting to know the module after stumbling onto the nflgame module first, but have not come close mastery yet.

Speed and efficiency is great but at this point a ~5 sec query vs 1 sec isn't make or break. Using the python interface is perfectly acceptable if it's simpler.

A specific example of the issues I've had thus far in terms of added complexity is aggregating 'rushing_att' inside the red-zone into a top 10 list with player names, and cumulative attempts. I got stumped after using your Megatron example as a basis to return all the 'plays' where 'rushing_att' greater than 1 and inside the oppenent's redzone but not in the player, stat format common in your aggregate examples.

@BurntSushi
Copy link
Owner

@BlackArbsCEO Perfect. OK, I have a better idea for what to go through. I'll probably make it a wiki article or something.

And yeah, the megatron example is a bit tricky. I think I may have made it too clever. I think there exists a longer version that is clearer.

@BlackArbsCEO
Copy link
Author

That's awesome! Let me know if I can help out by providing feedback or anything of that nature.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants