[BUG] [Crash Bug] SELECT * FROM <table> WHERE <condition> brings Crash #1342

Open
qwebug opened this issue Jun 6, 2024 · 0 comments
Labels
bug: Something isn't working
needs triage: Awaiting triage by a dask-sql maintainer

qwebug commented Jun 6, 2024

What happened:

SELECT * FROM <table> WHERE <condition> crashes when using CPU execution.

What you expected to happen:

The query should not crash when using CPU execution.

Minimal Complete Verifiable Example:

Query by JDBC:

CREATE TABLE t0 WITH ( location = '/tmp/t0.csv', format = 'csv', persist = True, gpu = FALSE );
CREATE TABLE t0_gpu WITH ( location = '/tmp/t0.csv', format = 'csv', persist = True, gpu = TRUE );
SELECT * FROM t0 WHERE (NOT t0.c0);

t0.csv:

c0,
'false',

SQL:

SELECT * FROM t0 WHERE (NOT t0.c0);

Result:

Query is gone (server restarted?)

The query crashed the dask-sql server, which then restarted.

SQL:

SELECT * FROM t0_gpu WHERE (NOT t0_gpu.c0);

Result:

 c0 | Unnamed: 1 
----+------------
(0 rows)

Query by Python:

import dask.dataframe as dd
from dask_sql import Context
import pandas as pd

c = Context()

df0 = pd.DataFrame({
    'c0':['false'],
})
t0 = dd.from_pandas(df0, npartitions=1)

c.create_table('t0', t0, persist = True, gpu=False)
c.create_table('t0_gpu', t0, persist = True, gpu=True)

print('GPU Result:')
result2= c.sql("SELECT * FROM t0_gpu WHERE (NOT t0_gpu.c0)").compute()
print(result2)

print('CPU Result:')
result1= c.sql("SELECT * FROM t0 WHERE (NOT t0.c0)").compute()
print(result1)

Result:

INFO:numba.cuda.cudadrv.driver:init
GPU Result:
Empty DataFrame
Columns: [c0]
Index: []
CPU Result:
Traceback (most recent call last):
  File "/tmp/test.py", line 19, in <module>
    result1= c.sql("SELECT * FROM t0 WHERE (NOT t0.c0)").compute()
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/base.py", line 314, in compute
    (result,) = compute(self, traverse=False, **kwargs)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/base.py", line 599, in compute
    results = schedule(dsk, keys, **kwargs)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/threaded.py", line 89, in get
    results = get_async(
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/local.py", line 511, in get_async
    raise_exception(exc, tb)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/local.py", line 319, in reraise
    raise exc
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/local.py", line 224, in execute_task
    result = _execute_task(task, data)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/optimization.py", line 990, in __call__
    return core.get(self.dsk, self.outkey, dict(zip(self.inkeys, args)))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 149, in get
    result = _execute_task(task, cache)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 113, in _execute_task
    return [_execute_task(a, cache) for a in arg]
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 113, in <listcomp>
    return [_execute_task(a, cache) for a in arg]
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in <genexpr>
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/core.py", line 119, in _execute_task
    return func(*(_execute_task(a, cache) for a in args))
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/utils.py", line 73, in apply
    return func(*args, **kwargs)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/dask/utils.py", line 1105, in __call__
    return getattr(__obj, self.method)(*args, **kwargs)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/generic.py", line 6240, in astype
    new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 448, in astype
    return self.apply("astype", dtype=dtype, copy=copy, errors=errors)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/internals/managers.py", line 352, in apply
    applied = getattr(b, f)(**kwargs)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/internals/blocks.py", line 526, in astype
    new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/dtypes/astype.py", line 299, in astype_array_safe
    new_values = astype_array(values, dtype, copy=copy)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/dtypes/astype.py", line 230, in astype_array
    values = astype_nansafe(values, dtype, copy=copy)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/dtypes/astype.py", line 95, in astype_nansafe
    return dtype.construct_array_type()._from_sequence(arr, dtype=dtype, copy=copy)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/arrays/masked.py", line 132, in _from_sequence
    values, mask = cls._coerce_to_array(scalars, dtype=dtype, copy=copy)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/arrays/boolean.py", line 344, in _coerce_to_array
    return coerce_to_array(value, copy=copy)
  File "/opt/conda/envs/rapids/lib/python3.10/site-packages/pandas/core/arrays/boolean.py", line 194, in coerce_to_array
    raise TypeError("Need to pass bool-like values")
TypeError: Need to pass bool-like values
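
The last frames of the traceback suggest that the CPU path fails while casting the string column c0 to pandas' nullable "boolean" dtype. The sketch below isolates that cast in plain pandas, without dask-sql, to show it is enough to raise the TypeError. The `bool_map` mapping and the explicit conversion after it are a hypothetical workaround for illustration only, not a proposed dask-sql fix, and the exact error message may differ between pandas versions.

```python
import pandas as pd

# Same data as the report: one string value, stored as object dtype.
df0 = pd.DataFrame({"c0": pd.Series(["false"], dtype=object)})

# Isolate the cast seen at the bottom of the traceback.
try:
    df0["c0"].astype("boolean")
    cast_error = None
except TypeError as exc:
    cast_error = exc

print("cast error:", repr(cast_error))

# Hypothetical workaround: convert the strings to real booleans first.
# `bool_map` is an illustrative name, not part of dask-sql or pandas.
bool_map = {"true": True, "false": False}
c0_bool = df0["c0"].str.lower().map(bool_map).astype("boolean")

# With a real boolean column, the NOT filter can be evaluated.
filtered = df0[~c0_bool]
print(filtered)
```

If the cast is indeed the cause, converting c0 to a real boolean column before calling `c.create_table` may avoid the crash, although the CPU result would then differ from the empty GPU result shown above.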

Anything else we need to know?:

Environment:

qwebug added the bug and needs triage labels on Jun 6, 2024.