AssertionError: The key 'team' does not exist for entity 'Game'. #34

Closed
roodhouse opened this issue Jul 30, 2014 · 8 comments

Comments

@roodhouse

Hi, it's me again.

I wasn't scared off. I am trying to work my way through the wiki page.

Following the examples from "an introduction to the query interface" I type in this:

import nfldb
db = nfldb.connect()

q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular', team='NE', week=1)
for d in q.as_drives():
    print d

And this error returns:

Traceback (most recent call last):
  File "C:\Users\Rugh\Desktop\nfldb\test.py", line 5, in <module>
    q.game(season_year=2013, season_type='Regular', team='NE', week=1)
  File "C:\Python27\lib\site-packages\nfldb\query.py", line 586, in game
    _append_conds(self._default_cond, types.Game, kw)
  File "C:\Python27\lib\site-packages\nfldb\query.py", line 202, in _append_conds
    % (kbare, entity.name)
AssertionError: The key 'team' does not exist for entity 'Game'.

Each time I try to use 'team' in q.game I get this message.

What am I missing?

@ochawkeye
Contributor

I don't think you're missing anything. I noticed the same issue in the sample code provided.

Try:

q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular', week=1)
q.play_player(team='NE')
for d in q.as_drives():
    print d

I know there is probably a better way than q.play_player, but it seems to work. @BurntSushi can probably explain.

What am I missing?

GitHub Markdown FYI: If you add

#```python

and

#```

around the code you post here (minus the pound signs), GitHub will render it with syntax highlighting.

@BurntSushi
Owner

Dammit. This is a genuine bug. I got overzealous with error checking in the last release. Nice find!
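
For readers wondering how a key check like this can misfire, here is a minimal sketch. It is not nfldb's actual code: `GAME_COLUMNS`, `GAME_ALIASES`, and both validator functions are hypothetical, and treating `team` as a convenience key that stands for `home_team` or `away_team` is an assumption made for illustration.

```python
# Hypothetical stand-ins for illustration; not taken from nfldb's source.
GAME_COLUMNS = {"season_year", "season_type", "week", "home_team", "away_team"}

# Assumed convenience key that expands to more than one real column.
GAME_ALIASES = {"team": ("home_team", "away_team")}


def validate_strict(criteria):
    """Accept only real column names (the overzealous variant)."""
    for key in criteria:
        assert key in GAME_COLUMNS, (
            "The key '%s' does not exist for entity 'Game'." % key
        )


def validate_with_aliases(criteria):
    """Accept real column names and known convenience keys."""
    for key in criteria:
        assert key in GAME_COLUMNS or key in GAME_ALIASES, (
            "The key '%s' does not exist for entity 'Game'." % key
        )


criteria = {"season_year": 2013, "week": 1, "team": "NE"}

# The strict check rejects the convenience key even though it is meaningful.
try:
    validate_strict(criteria)
    strict_error = None
except AssertionError as exc:
    strict_error = str(exc)

# The alias-aware check lets the same criteria through.
validate_with_aliases(criteria)
```

In this sketch the strict validator fails on `team` because it only knows about literal columns, which matches the symptom in the traceback above.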

@ochawkeye
Contributor

I'm going to change my answer here - something appears to be wrong. Almost every example gives a q.game(team='NE') methodology...

@BurntSushi
Owner

P.S. @ochawkeye Have you seen the write-up for the nfldb 0.2.0 release? https://github.com/BurntSushi/nfldb/releases/tag/0.2.0 (TL;DR I completely re-wrote nfldb.query.)

@BurntSushi
Owner

OK, this is fixed.

@roodhouse You will need to run pip install --upgrade nfldb and then the examples should work. Sorry about that!

@ochawkeye P.S. Check out the fix. We have regression tests now. :-)
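
As a rough idea of what a regression test for this kind of bug could look like, here is a sketch using the standard `unittest` module. It does not reproduce the tests in the actual fix: `check_game_keys` and `VALID_GAME_KEYS` are hypothetical stand-ins for the query layer's key check, so the example runs without a database.

```python
import unittest

# Hypothetical stand-in for the query layer's key check; not nfldb's code.
VALID_GAME_KEYS = {
    "season_year", "season_type", "week", "home_team", "away_team", "team",
}


def check_game_keys(**criteria):
    """Return the criteria unchanged, or raise KeyError on unknown keys."""
    unknown = sorted(k for k in criteria if k not in VALID_GAME_KEYS)
    if unknown:
        raise KeyError("unknown Game keys: %s" % ", ".join(unknown))
    return criteria


class GameKeyRegressionTest(unittest.TestCase):
    def test_team_key_is_accepted(self):
        # Same keyword arguments as the call in the bug report.
        got = check_game_keys(
            season_year=2013, season_type="Regular", team="NE", week=1
        )
        self.assertEqual(got["team"], "NE")

    def test_unknown_key_is_still_rejected(self):
        # The fix should not disable error checking altogether.
        with self.assertRaises(KeyError):
            check_game_keys(quarterback="Brady")


def run_suite():
    """Run the two tests and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(
        GameKeyRegressionTest
    )
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

The point of pairing the two tests is that one pins the reported call so it cannot break again, while the other confirms that genuinely unknown keys are still caught.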

@roodhouse
Author

Thanks guys. Time to test my newly learned skill. If I come back here then you know what happened...

@ochawkeye
Contributor

Out of curiosity, how expensive is a database query as opposed to iterating over the data redundantly?

q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular', week=1, team='NE')
q.drive(pos_team='NE')
for d in q.as_drives():
    print d

q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular', week=1, team='NE')
q.drive(pos_team='BUF')
for d in q.as_drives():
    print d

vs.

q = nfldb.Query(db)
q.game(season_year=2013, season_type='Regular', week=1, team='NE')
for d in q.as_drives():
    if d.pos_team == 'NE':
        print d
for d in q.as_drives():
    if d.pos_team == 'BUF':
        print d

Running first sample 100 times:

Execution time for 100 cycles = 2.66303673806
Average per cycle = 0.0266303673806

Running second sample 100 times:

Execution time for 100 cycles = 4.58160582431
Average per cycle = 0.0458160582431

But repeating the tests gives drastically varying times. Any thoughts on this?

@BurntSushi
Owner

@ochawkeye As a general rule of thumb, the more you put into a database query and the less work you do in Python, the faster it will be. I think that is consistent with your tests because you're doing more work in Python in the second example.
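
To make that rule of thumb concrete without needing an nfldb database, here is a small sketch that uses the standard library's `sqlite3` as a stand-in (nfldb itself runs on PostgreSQL). The `drive` table, its columns, and the synthetic rows are made up for illustration. Both functions compute the same total, but the first lets the database filter and aggregate, while the second pulls every row into Python and filters there.

```python
import sqlite3

# In-memory database with a made-up table; not nfldb's real schema.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drive (id INTEGER PRIMARY KEY, pos_team TEXT, yards INTEGER)"
)
rows = [("NE" if i % 2 == 0 else "BUF", i % 80) for i in range(1000)]
conn.executemany("INSERT INTO drive (pos_team, yards) VALUES (?, ?)", rows)


def yards_filtered_in_sql(team):
    """Filter and sum inside the database; only one row comes back."""
    cur = conn.execute(
        "SELECT COALESCE(SUM(yards), 0) FROM drive WHERE pos_team = ?",
        (team,),
    )
    return cur.fetchone()[0]


def yards_filtered_in_python(team):
    """Fetch every row, then filter and sum in Python."""
    total = 0
    for pos_team, yards in conn.execute("SELECT pos_team, yards FROM drive"):
        if pos_team == team:
            total += yards
    return total
```

The second version transfers and inspects every row regardless of team, which is the extra Python-side work the comment above refers to.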

I would suggest removing week=1 and running the test over an entire season to get more stable numbers.

Also, your tests should not have print in them. That could also be a source of instability between test runs. Maybe do a sum of something instead.

(If you wait, I'll do it when I get home later today.)
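
A benchmark along the lines suggested above could be sketched like this, using the standard `timeit` module. The `DRIVES` list is synthetic stand-in data and `total_yards` and `benchmark` are hypothetical helpers; in a real test the timed function would run the nfldb query instead. The timed code computes a sum rather than printing, and the harness repeats the measurement and keeps the fastest run, which tends to be less sensitive to background noise than a single timing.

```python
import timeit

# Synthetic stand-in for query results: (pos_team, yards) pairs.
DRIVES = [("NE" if i % 2 == 0 else "BUF", i % 80) for i in range(10000)]


def total_yards(team):
    """Sum yards for one team; no printing inside the timed code."""
    return sum(yards for pos_team, yards in DRIVES if pos_team == team)


def benchmark(func, repeat=5, number=20):
    """Return the best per-call time in seconds over several repeats."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings) / number


best = benchmark(lambda: total_yards("NE"))
```

Comparing the `best` values of two candidate implementations, each measured this way over the same data, should give steadier numbers than timing 100 cycles once.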
