Skip to content

Import data in CSV into Cassandra (Considering Consistency Level)

Notifications You must be signed in to change notification settings

nehalmehta/CSV2Cassandra

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

8 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Import CSV Data into Cassandra:

Utility can be used to import large CSV data into  Cassandra Column family.   It is used to import large CSV files (probably exported from SQL DBs like MySQL , PostGre SQL) into Cassandra. If required consistency level can also be maintained  while carrying out imports.  I had to import 10GB+ CSV files into cassandra, but only if  nodes in each zone were up  so consistency level was used. 

Utilitiy is built using https://github.com/rantav/hector API to import data into Cassandra.

Steps:
1. Configure following parameters at config/config.xml (Sample Config File is already present):
	1. hostName 
	2. port
	3. clusterName
	4. consistencyLevel
	5. columnFamily
	6. keySpace
	7. importFile
	8. delimeter
2. After configuration just run the utility, and  keeping checking logs if you think it is  taking  long time.


Assumptions: 

   First row of CSV is header row i.e column names for our column family
and similarly first column is key. 

Wish List: 

   1. Mapping of multiple CSV into Single Column Family and vice versa.
   2. Configurable column for keys.


-- THE UTILITY IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
THE SOFTWARE.


About

Import data in CSV into Cassandra (Considering Consistency Level)

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages