This repository has been archived by the owner on Nov 19, 2021. It is now read-only.

Using TPT Update driver on macOS causes segfault #63

Open
ChrisRx opened this issue Sep 9, 2018 · 7 comments

@ChrisRx
Contributor

ChrisRx commented Sep 9, 2018

While the TPT Export driver works as expected on macOS, the Update driver fails to complete the Initiate method and crashes with Segmentation Fault: 11. It seems to be related to the lifetime of the TPT objects, mostly teradata::client::API::Connection, and to how the compiler and/or platform handles the code differently.

@ChrisRx ChrisRx added the bug label Sep 9, 2018
@ChrisRx
Contributor Author

ChrisRx commented Sep 13, 2018

After looking into it more, it unfortunately looks like a bug in the TPT shared object on macOS, for at least version 15.10 (the TTU bundle I have available for macOS is patch version 9, but according to the version file the TPT is patch version 1). The issue appears to be caused by a class member, mDefaultArraySupport, being freed incorrectly. What's more interesting is that compiling with the -m32 flag, which tells clang to build a 32-bit binary, seems to fix the problem. This isn't viable, however, because most people will want to (and should) use 64-bit Python on all platforms, and an extension compiled as 32-bit can only be loaded by 32-bit Python. It is also worth noting that the -mx32 flag, which compiles the target with a 32-bit word size while keeping the resulting binary 64-bit, does not seem to fix the issue.

With that said, I think there are two unfortunate answers to this issue. The first is to find a newer copy of the TTU for macOS and hope the bug is fixed there (it isn't available online and I am having a lot of trouble tracking it down elsewhere). The other is to add a build macro that prevents freeing teradata::client::API::Connection on macOS for at least TTU version 15.10, at the cost of leaking that memory.
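
The second option could be wired into the build roughly as follows. This is only a sketch: the macro name GIRAFFEZ_TPT_SKIP_CONNECTION_DELETE and the helper tpt_define_macros are hypothetical and not part of giraffez, and the list of affected TTU versions is an assumption based on the 15.10 observation above. The returned list uses the (name, value) tuple format that setuptools' Extension accepts for define_macros; the C++ side would then wrap the delete of the Connection in an #ifndef on the same macro.

```python
import sys

# Hypothetical macro name, used only for illustration.
SKIP_DELETE_MACRO = "GIRAFFEZ_TPT_SKIP_CONNECTION_DELETE"

# Assumption: only TTU 15.10 on macOS is known to be affected.
AFFECTED_TTU_VERSIONS = ("15.10",)


def tpt_define_macros(platform=None, ttu_version=""):
    """Return define_macros entries for the extension build.

    The macro is set only on macOS with an affected TTU version, so every
    other platform keeps freeing the Connection normally.
    """
    if platform is None:
        platform = sys.platform
    affected = platform == "darwin" and ttu_version.startswith(AFFECTED_TTU_VERSIONS)
    return [(SKIP_DELETE_MACRO, "1")] if affected else []
```

With these assumptions, tpt_define_macros("darwin", "15.10.01") returns the macro definition, while the same call with "linux" returns an empty list.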

@ausiddiqui

I am getting a segmentation fault on BulkLoad, even with something as simple as this:

import giraffez
import pandas as pd
import os

filepath = '/Users/ash/test.csv'
table = 'mydb.mytable'
system = os.environ.get('TERADATA_SYSTEM')
username = os.environ.get('TERADATA_USER')
password = os.environ.get('TERADATA_PASS')

df = pd.DataFrame({'a':[1,3,5],'b':[4,6,8],'c':[2,7,9]})
df.to_csv(filepath, index=False)

q = """
create table mydb.mytable (
    a int,
    b int,
    c int
) primary index (a);
"""

with giraffez.Cmd(host=system, username=username, password=password) as cmd:
    cmd.execute(q)

with giraffez.BulkLoad(table=table, host=system, username=username, password=password, coerce_floats=False, cleanup=True) as load:
    load.from_file(filepath, table=table, delimiter=",", null='')

Python: 3.5.5
giraffez: 2.0.24 and 2.0.23
TTU: 16.1
OS: macOS 10.14.2
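
When reporting a crash like this, it can help to capture where in the Python code the process died. A minimal sketch using the standard library's faulthandler module follows; the helper name enable_crash_traceback and the log path are illustrative. It only records the Python-level traceback, not the C++ frames inside the TPT libraries.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Write the Python traceback to log_path if the process gets a fatal signal.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS and
    SIGILL. The file object is returned because it must stay open for as
    long as the handler is enabled.
    """
    log_file = open(log_path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file
```

Calling enable_crash_traceback("bulkload_crash.log") before the BulkLoad block would leave the traceback in that file after a segfault. Running the script with python -X faulthandler enables the same handler and writes to stderr instead.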

@ausiddiqui

Hi Chris,

Any updates on this? I am running TTU 16.10 on macOS and am also getting segmentation faults. If you give me a set of instructions, I can at least find out whether what you've discovered is the same issue on newer versions of TPT. I only have problems when running BulkLoad, not with Cmd or Export. Thanks.

@ChrisRx
Contributor Author

ChrisRx commented Jan 17, 2019

I'm a little shocked that you are experiencing that on 16.10 as well. Unless I'm mistaken, the macOS shared libraries for the TPT API have a bug that results in a double free in the destructor chain even when they are used correctly, and I honestly just figured they would have fixed that in subsequent versions (I only have access to v15.10 for macOS, so I couldn't confirm it myself). It's been months since I looked into it, but tracing it as far as I could (the proprietary shared libraries do not leave a lot to debug), it appeared to be either a free on an uninitialized class member or a double free. I could add some platform-specific code to prevent the C++ classes from being cleaned up on macOS, but that feels wrong (like I must be missing something, or Teradata is and they are just unaware that their macOS libs are busted).

First, I appreciate you reporting this and other stuff! I think the next steps might be to see if this compiles and runs for you:

// seggie.cpp
#include <connection.h>

using namespace teradata::client::API;

int main() {
    Connection *conn = new Connection();
    delete conn;
    return 0;
}

If it helps, compile with something like:

export TERADATA_HOME=/Library/Application\ Support/teradata/client/16.10
clang++ -O3 -g -Wall -I "$TERADATA_HOME/tbuild/tptapi/inc" -L "$TERADATA_HOME/tbuild/lib" -ltelapi -o seggie seggie.cpp

This is the smallest example showcasing the behavior. Doing the same thing on all other platforms does not result in memory corruption (just macOS). If it ends up doing the same for you, we either need a macOS/C++ expert, or just need to report the bug on the Teradata forums with the given files to reproduce it. Let me know what ends up happening when you run this.
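
To report the outcome of running the binary unambiguously, the exit status can be decoded with a few lines of Python. This is a sketch, assuming a POSIX system and that the binary was built as ./seggie with the command above; the helper names describe_exit and run_repro are illustrative.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a readable description.

    On POSIX, a negative return code -N means the child was killed by signal N.
    """
    if returncode < 0:
        return "killed by {}".format(signal.Signals(-returncode).name)
    return "exited with status {}".format(returncode)


def run_repro(path="./seggie"):
    """Run the reproduction binary and describe how it ended."""
    completed = subprocess.run([path])
    return describe_exit(completed.returncode)
```

A clean run is reported as "exited with status 0"; a crash shows up as a signal name such as SIGSEGV or SIGABRT.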

@ausiddiqui

This creates a Unix executable file named seggie and also a package named seggie.dSYM. Not sure what's the best way to share these here.

@ausiddiqui

I was able to compile the ext-cleanup branch and am no longer getting this issue. Now on macOS 10.14.3.

@ausiddiqui

I'm still getting this issue on Ubuntu, CentOS and macOS, but it depends on the content of the data being uploaded. With some specific datasets I still get a segfault, even when the empty table is defined with VARCHAR columns only and the data has no null values. With other datasets this never happens.

records.csv
aID,Name,Branch,YR,CGPA
102,Abcd,COE,2,9.0
109,Bcde,COE,2,9.1
100,Cdef,IT,2,9.3
103,Defg,SE,1,9.5
106,Efgh,MCE,3,7.8
112,Fghi,EP,2,9.1
116,Ghij,IT,3,9.9
122,Hijk,EP,1,8.9
120,Ijkl,SE,2,9.2
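110,Jklm,MCE,3,7.9

import os
import giraffez as g

# Assumption: connection settings come from the environment, as in the earlier report
system = os.environ.get('TERADATA_SYSTEM')
username = os.environ.get('TERADATA_USER')
password = os.environ.get('TERADATA_PASS')

def exc(q):
    # Run a single SQL statement through giraffez.Cmd
    with g.Cmd(host=system, username=username, password=password) as cmd:
        cmd.execute(q)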

table = 'mydb.mytable'
exc("drop table mydb.mytable")
exc("""
    create multiset table mydb.mytable (
        aID int,
        Name varchar(50) character set unicode not casespecific,
        Branch varchar(5) character set unicode not casespecific,
        YR int,
        CGPA varchar(5) character set unicode not casespecific
    ) primary index (aID);
    """)

with g.BulkLoad(table=table, host=system, username=username, password=password, coerce_floats=False, cleanup=True, print_error_table=True) as load:
    load.from_file('./records.csv', table=table, delimiter=",", null='')
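
Since the crash depends on the data, one way to narrow it down is to search for the shortest prefix of the file that still triggers it. The sketch below is generic and not tied to giraffez: crashes is a hypothetical callback (for example, one that writes the rows to a temporary CSV and runs the load in a child process, so that a segfault does not kill the search itself), and the search assumes that once a prefix crashes, every longer prefix crashes as well.

```python
def smallest_failing_prefix(rows, crashes):
    """Binary-search the shortest prefix of rows for which crashes(prefix) is true.

    Returns None when the full list does not crash. Assumes crashing is
    monotonic: if rows[:n] crashes, rows[:m] crashes for every m > n.
    """
    if not crashes(rows):
        return None
    lo, hi = 1, len(rows)
    while lo < hi:
        mid = (lo + hi) // 2
        if crashes(rows[:mid]):
            hi = mid       # a crash already happens within the first mid rows
        else:
            lo = mid + 1   # the offending row lies beyond the first mid rows
    return rows[:lo]
```

The last row of the returned prefix is then the first candidate to inspect for an unusual value or encoding.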
