Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ISSUE #4331]Add top-level interfaces of jdbc source connector #4332

Merged
merged 1 commit into from
Aug 29, 2023

Conversation

mxsm
Copy link
Member

@mxsm mxsm commented Aug 4, 2023

Fixes #4331

Motivation

Explain the content here.
Explain why you want to make the changes and what problem you're trying to solve.

Modifications

Describe the modifications you've done.

  • Add top-level interfaces of jdbc source connector

Documentation

  • Does this pull request introduce a new feature? (yes / no)
  • If yes, how is the feature documented? (not applicable / docs / JavaDocs / not documented)
  • If a feature is not applicable for documentation, explain why?
  • If a feature is not documented yet in this PR, please create a followup issue for adding the documentation

@mxsm mxsm requested a review from xwm1992 August 4, 2023 14:11
@mxsm mxsm changed the title [ISSUE #4331]add source-connector-jdbc(Support Mysql) module [ISSUE #4331]Add jdbc source connector support mysql Aug 4, 2023
@codecov
Copy link

codecov bot commented Aug 4, 2023

Codecov Report

Merging #4332 (8a7d799) into master (1e59cdf) will decrease coverage by 0.32%.
The diff coverage is 8.30%.

❗ Current head 8a7d799 differs from pull request most recent head 802b96c. Consider uploading reports for the commit 802b96c to get more accurate results

@@             Coverage Diff              @@
##             master    #4332      +/-   ##
============================================
- Coverage     17.80%   17.49%   -0.32%     
- Complexity     1516     1544      +28     
============================================
  Files           611      663      +52     
  Lines         25598    26441     +843     
  Branches       2377     2439      +62     
============================================
+ Hits           4559     4627      +68     
- Misses        20601    21364     +763     
- Partials        438      450      +12     
Files Changed Coverage Δ
...pache/eventmesh/connector/jdbc/CatalogChanges.java 0.00% <0.00%> (ø)
...ava/org/apache/eventmesh/connector/jdbc/Field.java 0.00% <0.00%> (ø)
...ache/eventmesh/connector/jdbc/JdbcConnectData.java 0.00% <0.00%> (ø)
...e/eventmesh/connector/jdbc/JdbcDriverMetaData.java 0.00% <0.00%> (ø)
...a/org/apache/eventmesh/connector/jdbc/Payload.java 0.00% <0.00%> (ø)
...va/org/apache/eventmesh/connector/jdbc/Schema.java 0.00% <0.00%> (ø)
...he/eventmesh/connector/jdbc/config/JdbcConfig.java 0.00% <0.00%> (ø)
...ntmesh/connector/jdbc/config/JdbcServerConfig.java 0.00% <0.00%> (ø)
...mesh/connector/jdbc/connection/JdbcConnection.java 0.00% <0.00%> (ø)
...h/connector/jdbc/connection/PreparedParameter.java 0.00% <0.00%> (ø)
... and 43 more

... and 1 file with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@Alonexc
Copy link
Contributor

Alonexc commented Aug 11, 2023

image Merging the master branch will fix this build error.

@mxsm
Copy link
Member Author

mxsm commented Aug 11, 2023

image Merging the master branch will fix this build error.

I will do it late

@mxsm mxsm force-pushed the eventmesh-connector-jdbc branch 2 times, most recently from 02b93b1 to bbd17f6 Compare August 12, 2023 02:04
@qqeasonchen
Copy link
Contributor

@mxsm too many files changed, could you please split it?

@mxsm
Copy link
Member Author

mxsm commented Aug 17, 2023

@mxsm too many files changed, could you please split it?

I will

@HattoriHenzo
Copy link
Contributor

@mxsm The way I was looking this implementation is a bit different. What do you think about a wrapper API around vanilla jdbc with the ability to load a database driver depending on the source(MySQL, PostgreSQL, SQL Server etc...) or the sink(MySQL, PostgreSQL, SQL Server etc..).

@mxsm
Copy link
Member Author

mxsm commented Aug 22, 2023

@mxsm The way I was looking this implementation is a bit different. What do you think about a wrapper API around vanilla jdbc with the ability to load a database driver depending on the source(MySQL, PostgreSQL, SQL Server etc...) or the sink(MySQL, PostgreSQL, SQL Server etc..).

@Pil0tXia @HattoriHenzo
Thank you for your suggestion. Here is an abstract architecture diagram of my code:
The diagram is divided into three parts:

image

The main idea of the design is to parse the data model of the database supporting Jdbc into the corresponding EventMesh event model for processing.Provide a universal interface for interactivity with the database layer and load different database drivers through protocols. Here, I provide dynamic loading of database drivers through JdbcConnection. Later, additional database support can be added by simply including the corresponding database JDBC driver. When the module starts, it automatically loads the corresponding driver based on the configured JDBC protocol.

  1. The JdbcConnection API wrapper part is mainly responsible for loading different drivers based on the configured JDBC protocol and creating a connection for the engine to use, for example, configuring jdbc:mysql:https://localhost:3306 would load the MySQL driver.The JdbcConnection can load different database drivers based on different protocols.

  2. The abstract engine module is divided into two types: a snapshot engine for processing data snapshots and a CDC engine for processing database CDC. The engine is responsible for converting data into custom events within EventMesh.

  3. Task management mainly manages tasks for processing events. Tasks are divided into two categories: tasks for processing snapshot events (these tasks can be closed after snapshot processing is complete), and tasks for processing CDC events.

I will improve the code and ask for your help to review it later.

@HattoriHenzo
Copy link
Contributor

HattoriHenzo commented Aug 22, 2023

@mxsm thanks for the details. Here is a more detailed schema that represent my approach.

eventmesh-connector

  1. The configuration file contains the jdbc driver to load with a factory class
  2. The source/sink are implemented using the eventmesh-openconnect-java such as eventmesh-connector-mongodb or eventmesh-connector-redis
  3. java.sql classes a wrapped by the source/sink classes to interact with the underlined database

@mxsm
Copy link
Member Author

mxsm commented Aug 23, 2023

@HattoriHenzo Thanks for the visual illustration!
image

  1. JdbcConnection provides a wrapper for java.sql, such as obtaining a connection, setting the connection's autocommit property, and more. For specific database queries, JdbcConnection can be subclassed for implementation. JdbcConnection provides general interfaces for things like executing SQL statements.
  2. The connection factory is responsible for creating JdbcConnection connections. Here, java.sql.DriverManager is used to automatically load database drivers based on the jdbc protocol configured in a configuration file.
  3. For the Jdbc Source, since it needs to support Cdc, the implementation of Cdc is different for different data types. The relevant configuration for Cdc for a specific database is added in the Source configuration file.
  4. The JDBC Sink is mainly responsible for inserting data into the corresponding database. Only the general configuration for JDBC needs to be configured, and then the data from the Source is converted into the data needed by the Sink based on the type of database and saved.

The overall architecture design is basically the same as yours, but there are some differences in nomenclature and implementation details. I will split the PR into smaller chunks to facilitate code review by others.

@HattoriHenzo
Copy link
Contributor

@mxsm Thanks for your reply. Your explanation is very clear.

@qqeasonchen
Copy link
Contributor

@mxsm Have you done your split?

@mxsm
Copy link
Member Author

mxsm commented Aug 25, 2023

@qqeasonchen
I am still working on further optimizing the overall code and data format sent to EventMesh. The corresponding PR will be submitted in the next few days. The PR will be submitted in the following sections:

  1. Top-level interfaces
  2. MySQL snapshot processing implementation
  3. MySQL CDC processing implementation

After the overall MySQLConnector source is implemented, MySQL, PG, and Oracle sinks will be implemented first.

@mxsm mxsm force-pushed the eventmesh-connector-jdbc branch 4 times, most recently from 6582689 to dd67344 Compare August 27, 2023 05:35
@mxsm
Copy link
Member Author

mxsm commented Aug 27, 2023

@mxsm Have you done your split?

@qqeasonchen Have done

@mxsm mxsm requested a review from qqeasonchen August 27, 2023 08:23
@qqeasonchen
Copy link
Contributor

@mxsm Have you done your split?

@qqeasonchen Have done

137 file changed is too risky.

@mxsm
Copy link
Member Author

mxsm commented Aug 28, 2023

@mxsm Have you done your split?

@qqeasonchen Have done

137 file changed is too risky.

@qqeasonchen I will remove the database abstraction module and event module, leaving only the code at the interface level.

@qqeasonchen
Copy link
Contributor

@mxsm Have you done your split?

@qqeasonchen Have done

137 file changed is too risky.

@qqeasonchen I will remove the database abstraction module and event module, leaving only the code at the interface level.

you can submit only jdbc-connector module in one pr, and other changes in other prs

@mxsm
Copy link
Member Author

mxsm commented Aug 28, 2023

@qqeasonchen Based on the current PR, the changes are mainly to the jdbc connector module. During the previous implementation, the changes in PR #4131 were included, but PR #4131 has been in the code review and merge state. Now that I have made changes, only the interface-level code is included in the current PR. I was wondering if you think it’s okay to review and merge PR #4131.

@mxsm
Copy link
Member Author

mxsm commented Aug 28, 2023

@qqeasonchen The current code only includes the jdbcconnector module, but I need help reviewing the code in PR #4131 to solve a problem in the subsequent SPI loading.

@qqeasonchen
Copy link
Contributor

@qqeasonchen The current code only includes the jdbcconnector module, but I need help reviewing the code in PR #4131 to solve a problem in the subsequent SPI loading.

then we need to review #4131 first, you can ping anyone you want

@mxsm mxsm requested a review from pandaapo August 29, 2023 05:48
@mxsm mxsm changed the title [ISSUE #4331]Add jdbc source connector support mysql [ISSUE #4331]Add top-level interfaces of jdbc source connector Aug 29, 2023
@qqeasonchen qqeasonchen merged commit eea1d01 into apache:master Aug 29, 2023
7 checks passed
@mxsm mxsm deleted the eventmesh-connector-jdbc branch August 29, 2023 15:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Add source-connector-jdbc top-level interfaces
5 participants