Merge from pulsar master #3

Merged
merged 74 commits into from
Mar 10, 2020
kaynewu
Owner

@kaynewu kaynewu commented Mar 10, 2020

merge request

ltamber and others added 30 commits February 15, 2020 21:27
…6187)

Fixes #5904 

### Motivation
Pulsar supports unloading a non-partitioned topic or a single partition of a partitioned topic. If a partitioned topic has many partitions, users have to list all of the partitions and unload them one by one. We need to support unloading all partitions of a partitioned topic at once.
…ctions for production-readiness (#6104)

This PR is to provide integration tests that test execution of Go functions that are managed by the Java FunctionManager. This will allow us to test things like behavior during function timeouts, heartbeat failures, and other situations that can only be effectively tested in an integration test. 

Master issue: #4175
Fixes issue: #6204 

### Modifications

We must add Go to the integration testing logic. We must also build the Go dependencies into the test Dockerfile to ensure the Go binaries are available at runtime for the integration tests.
### Motivation

Fixes #5999

### Modifications

Add the logic to handle the blank cluster name.
…er OOM (#6178)

Motivation
Introduce maxMessagePublishBufferSizeInMB configuration to avoid broker OOM.

Modifications
If the total size of the messages being processed exceeds this value, the broker stops reading data from the connection. Once the available size is greater than half of maxMessagePublishBufferSizeInMB, the broker resumes auto-reading data from the connection.
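
The pause/resume rule described above can be sketched roughly as follows. This is a minimal illustration, not the broker's actual code: the class `PublishBufferLimiter` and its method names are hypothetical, and only the `maxMessagePublishBufferSizeInMB` setting and the "resume at half" threshold come from the description.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of publish-buffer backpressure; not the broker's real implementation. */
final class PublishBufferLimiter {
    private final long maxBytes;
    private final AtomicLong pendingBytes = new AtomicLong();
    private volatile boolean autoRead = true;

    PublishBufferLimiter(int maxMessagePublishBufferSizeInMB) {
        this.maxBytes = maxMessagePublishBufferSizeInMB * 1024L * 1024L;
    }

    /** A publish request arrived: pause reads once pending bytes exceed the limit. */
    void onMessageReceived(long sizeBytes) {
        if (pendingBytes.addAndGet(sizeBytes) > maxBytes) {
            autoRead = false;
        }
    }

    /** A publish completed: resume reads once more than half of the buffer is available again. */
    void onMessagePersisted(long sizeBytes) {
        long pending = pendingBytes.addAndGet(-sizeBytes);
        if (maxBytes - pending > maxBytes / 2) {
            autoRead = true;
        }
    }

    boolean isAutoRead() {
        return autoRead;
    }
}
```

Resuming only after the buffer drains to half, rather than as soon as it dips under the limit, avoids toggling reads on and off for every message near the threshold.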
Fixes #6045 #6281 

### Motivation

Enable getting the precise backlog and the backlog without delayed messages.

### Verifying this change

Added new unit tests for the change.
Fixes #5560

### Motivation

Currently, Pulsar SQL can't read data with a KeyValue schema. This PR adds support for reading messages with a key-value schema in Pulsar SQL.

### Modifications

Add KeyValue schema support for Pulsar SQL. Add the prefix `__key.` to the key field names.
#6339)

Motivation

Avoid fetching partition metadata when the topic name is already a partition name.
Currently, if users want to skip all messages for a partitioned topic or unload a partitioned topic, the broker fetches the topic metadata many times. For a topic name that is already a partition name, there is no need to fetch the partitioned topic metadata again.
…aml (#6340)

Fixes #6338

### Motivation
This commit started while I was using Helm in my local minikube and noticed a mismatch between the `values-mini.yaml` and `values.yaml` files. At first I thought it was a copy/paste error, so I created #6338.

Then I looked into how these env vars [were used](https://github.com/apache/pulsar/blob/28875d5abc4cd13a3e9cc4f59524d2566d9f9f05/conf/bkenv.sh#L36) and found that it is OK to use `PULSAR_MEM` as an alternative. But this introduces problems:
1. Since `BOOKIE_GC` was not defined, the default [BOOKIE_EXTRA_OPTS](https://github.com/apache/pulsar/blob/28875d5abc4cd13a3e9cc4f59524d2566d9f9f05/conf/bkenv.sh#L39) ends up using the default value of `BOOKIE_GC`, which overrides the same JVM parameters defined earlier in `PULSAR_MEM`.
2. It may cause problems if the bootstrap scripts change in later development, so it is better to make the settings explicit.

So I created this PR to solve the above (hidden) problems.

### Modifications

As mentioned above, I've made such modifications below:
1. Make `BOOKIE_MEM` and `BOOKIE_GC` explicit in the `values-mini.yaml` file, keeping the same format as in the `values.yaml` file.
2. Remove all GC-log-printing args, considering the resource constraints of a minikube environment. The removed args are `-XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+PrintGCApplicationStoppedTime -XX:+PrintHeapAtGC -verbosegc -XX:G1LogLevel=finest`.
3. Leave `PULSAR_PREFIX_dbStorage_rocksDB_blockCacheSize` empty as usual, since [conf/standalone.conf#L576](https://github.com/apache/pulsar/blob/df152109415f2b10dd83e8afe50d9db7ab7cbad5/conf/standalone.conf#L576) says it uses 10% of the direct memory size by default.
The key shared policy does not support setting the maximum key hash range, so fix the java doc.
…6337)

Currently, SubscriptionMode is a parameter used to create ConsumerImpl, but it is not exposed, so users cannot set this value for a consumer. This change makes SubscriptionMode a member of ConsumerConfigurationData, so users can set this parameter when creating a consumer.
* Corrected method of specifying Windows path to LLVM tools

* Fixing windows build

* Corrected the dll install path

* Fixing pulsarShared paths
"fatal: reference is not a tree" is a known issue in actions/checkout#23 and fixed in checkout@v2, update checkout used in GitHub actions.
Fixes #6260 

Snappy, like other compression codecs (LZ4, ZSTD), depends on native libraries to do the actual encoding and decoding. When we shade it in a fat jar, only the Java classes of Snappy are shaded, which leaves the JNI bindings incompatible with the underlying C++ code.

We should just remove the shade for snappy, and let maven import its lib as a dependency.

I've tested the shaded jar locally generated by this pr, it works for all compression codecs.
In #6386, checkout@v2 was brought in for checkout.

However, it checks out the PR merge commit by default, which breaks the diff-only action that looks for the commits a PR is based on, and causes all tests to be skipped.

This PR fixes the issue and has been proven to work with #6396 Brokers/unit-tests.
…6373)

This applies the recommended fix from
#6355 (comment)

Fixes #6355

### Motivation

This PR corrects the configmap data which was causing the autorecovery pod to crashloop
with `could not find or load main class`

### Modifications

Updated the configmap var data per [this comment](#6355 (comment)) from @sijie
)

### Motivation

Creating a topic does not wait for the cursors of its replicators to be created.

## Verifying this change

The existing unit tests cover this change.
…l back duration. (#6392)

Currently, when constructing a reader, users can set both a start message ID and a start time.

This is confusing, and the combination should be forbidden.
The current logic for `resetCursor` by timestamp is odd. The first message it returns is the last message published earlier than or at the designated timestamp. This earlier message should not be emitted.
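
One way to picture the intended behavior is a search for the first message that is not earlier than the target timestamp. The sketch below is only an illustration under stated assumptions: it works on a plain sorted array of publish times rather than on a managed ledger, and the class `TimestampSeek` and its method are hypothetical names, not part of Pulsar.

```java
/** Hypothetical sketch: locate the first message that should be emitted after a reset by timestamp. */
final class TimestampSeek {
    /**
     * Returns the index of the first entry whose publish time is >= timestamp,
     * or publishTimes.length if every entry is earlier.
     * Assumes publishTimes is sorted in ascending order.
     */
    static int firstAtOrAfter(long[] publishTimes, long timestamp) {
        int lo = 0;
        int hi = publishTimes.length;
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (publishTimes[mid] < timestamp) {
                lo = mid + 1; // entry is too early, skip it
            } else {
                hi = mid;     // entry qualifies, look for an earlier qualifying one
            }
        }
        return lo;
    }
}
```

With publish times `{10, 20, 30}` and a target of `25`, the search lands on index 2 (the message at `30`), so the earlier message at `20` is never emitted.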
Four kinds of errors are fixed in this PR:

- Array index out of bounds
- Inconsistent equals and hashCode
- Missing format argument
- Reference equality test of boxed types

According to https://lgtm.com/projects/g/apache/pulsar/alerts/?mode=tree&severity=error&id=&lang=java
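
As an illustration of the last category, the sketch below contrasts reference comparison of boxed values with value comparison. The class and method names are hypothetical and are not taken from the Pulsar code base.

```java
import java.util.Objects;

/** Hypothetical illustration of the "reference equality test of boxed types" alert. */
final class BoxedEquality {
    /** Bug pattern: compares object identity, which may differ for equal values. */
    static boolean sameByReference(Long a, Long b) {
        return a == b;
    }

    /** Fix: compares the numeric values and is also null-safe. */
    static boolean sameByValue(Long a, Long b) {
        return Objects.equals(a, b);
    }
}
```

Whether `sameByReference` returns `true` for two equal values depends on boxing caches, which is exactly why static analysis flags it.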
…essage in batch. (#6345)

Fixes #6344 
Fixes #6350

The bug was brought in #5622 by changing the skip logic wrongly.
### Motivation

Fixes #6343

### Modifications

Add a method to cast object value to `String`.
Fixes #6400

### Motivation
This problem is blocking the current tests. Version 1.1.8 of `enum34` seems to be broken; the problem can be reproduced as follows:

Use pulsar latest code:
```
cd pulsar
mvn clean install -DskipTests
docker pull apachepulsar/pulsar-build:ubuntu-16.04
docker run -it -v $PWD:/pulsar --name pulsar apachepulsar/pulsar-build:ubuntu-16.04 /bin/bash
docker exec -it pulsar /bin/bash
cmake .
make -j4 && make install 
cd python
python setup.py bdist_wheel
pip install dist/pulsar_client-*-linux_x86_64.whl
```
`pip show enum34`
```
Name: enum34
Version: 1.1.8
Summary: Python 3.4 Enum backported to 3.3, 3.2, 3.1, 2.7, 2.6, 2.5, and 2.4
Home-page: https://bitbucket.org/stoneleaf/enum34
Author: Ethan Furman
Author-email: [email protected]
License: BSD License
Location: /usr/local/lib/python2.7/dist-packages
Requires:
Required-by: pulsar-client, grpcio
```

```
root@55e06c5c770f:/pulsar/pulsar-client-cpp/python# python
Python 2.7.12 (default, Oct  8 2019, 14:14:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from enum import Enum, EnumMeta
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ImportError: No module named enum
>>> exit()
```

There is no problem with using 1.1.9 in the test.

### Modifications

* Upgrade enum34 from 1.1.8 to 1.1.9

### Verifying this change

Local tests pass.
When the broker creates its internal client, it sets tlsTrustCertsFilePath to "getTlsCertificateFilePath()", but it should be "getBrokerClientTrustCertsFilePath()".
…elated to shading in pulsar-client. (#6406)

Motivation
Avro schemas are quite important for proper data flow, and it is a pity that issue #3762 stayed untouched for so long. There are some workarounds to make Pulsar use an original Avro schema, but in the end it is hard to run an enterprise solution on workarounds. With this PR I would like to find a solution to the problem caused by shading Avro in pulsar-client. As discussed in the issue, there are two possible solutions:

1. Unshade the Avro library in the pulsar-client library. (IMHO this seems like the proper solution, but it also brings a risk of unknown side effects.)
2. Use reflection to get the original schemas from generated classes. (I went for this solution.)

Could you please comment on whether this is a proper solution for the problem? I will add tests once my approach is confirmed.

Modifications
First, we try to extract the original Avro schema from the "$SCHEMA" field using reflection. If that doesn't work, the process falls back to generating the schema from the POJO.
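
The reflect-then-fall-back flow could look roughly like the sketch below. It is not the PR's code: `SchemaResolver`, `GeneratedRecordExample` and `PlainPojoExample` are hypothetical, schemas are modeled as plain strings instead of Avro `Schema` objects, and the field name is passed in by the caller so that nothing here asserts how Avro names it.

```java
import java.lang.reflect.Field;
import java.util.Optional;

/** Hypothetical stand-in for a generated class that carries its original schema in a static field. */
final class GeneratedRecordExample {
    static final String $SCHEMA = "{\"type\":\"record\",\"name\":\"Example\"}";
}

/** Hypothetical plain POJO without an embedded schema. */
final class PlainPojoExample {
    int id;
}

final class SchemaResolver {
    /** Reads a static field by name via reflection; empty if it is missing or unreadable. */
    static Optional<Object> readStaticField(Class<?> type, String fieldName) {
        try {
            Field field = type.getDeclaredField(fieldName);
            field.setAccessible(true);
            return Optional.ofNullable(field.get(null)); // null target: static fields only
        } catch (ReflectiveOperationException | RuntimeException e) {
            return Optional.empty();
        }
    }

    /** Prefers the embedded schema and otherwise falls back to a schema derived from the POJO. */
    static String resolve(Class<?> type, String fieldName) {
        return readStaticField(type, fieldName)
                .map(Object::toString)
                .orElseGet(() -> "derived-from-pojo:" + type.getSimpleName());
    }
}
```

Calling `SchemaResolver.resolve(GeneratedRecordExample.class, "$SCHEMA")` returns the embedded schema string, while `PlainPojoExample` takes the fallback branch.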
### Motivation

Add verification for SchemaDefinitionBuilderImpl.java

### Verifying this change

Added a new unit test.
### Modifications

- Removed dependencies on test libraries that were already imported in the parent pom file.

- Removed groupId tags that are inherited from the parent pom file.
BatchReceivePolicy implements Serializable.
racorn and others added 28 commits March 2, 2020 23:21
Fixes #6453 

### Motivation
`ConsumerBase` and `ProducerImpl` use `System.currentTimeMillis()` to measure the elapsed time in the 'operations' inner classes (`ConsumerBase$OpBatchReceive` and `ProducerImpl$OpSendMsg`).

An instance variable `createdAt` is initialized with `System.currentTimeMillis()`, but it is not used for reading wall clock time; the variable is only used for computing elapsed time (e.g. the timeout for a batch).

When the variable is used to compute elapsed time, it would make more sense to use `System.nanoTime()`.

### Modifications

The instance variable `createdAt` in `ConsumerBase$OpBatchReceive` and `ProducerImpl$OpSendMsg` is initialized with `System.nanoTime()`. Usage of the variable is updated to reflect that it holds nano time; computations of elapsed time take the difference between the current system nano time and the `createdAt` variable.

The `createdAt` field is package protected, and is currently only used in the declaring class and outer class, limiting the chances for unwanted side effects.
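
A minimal sketch of the pattern, assuming a hypothetical `OpTimer` class in place of the real `OpBatchReceive` and `OpSendMsg` inner classes:

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical operation holder that measures elapsed time with the monotonic clock. */
final class OpTimer {
    // System.nanoTime() is only meaningful as a difference between two readings;
    // unlike System.currentTimeMillis(), it is not affected by wall-clock adjustments.
    final long createdAt = System.nanoTime();

    long elapsedMillis() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - createdAt);
    }

    boolean hasTimedOut(long timeoutMillis) {
        // Compare differences rather than absolute values, since nanoTime can overflow.
        return System.nanoTime() - createdAt >= TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
    }
}
```

Because `createdAt` no longer holds wall clock time, it must never be logged or compared against `System.currentTimeMillis()` values.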
* Fixed the max backoff configuration for lookups

* Fixed test expectation

* More test fixes
### Motivation
The Pulsar examples include some third-party libraries with security vulnerabilities.
- log4j-core-2.8.1
https://www.cvedetails.com/cve/CVE-2017-5645

### Modifications

- Upgraded scala-maven-plugin from 4.0.1 to 4.1.0. log4j-core-2.8.1 was installed because scala-maven-plugin depends on it.
### Motivation
Proxy logging fetches an incorrect producerId for the `Send` command: logging always gets producerId 0, so it fetches an invalid topic name for the log entry.

### Modification
Fixed topic logging by fetching correct producerId for `Send` command.
…ons (#6456)

### Motivation

Fixes #6394

### Modifications

- provide a flag `allowAutoSubscriptionCreation` in `ServiceConfiguration`, defaults to `true`
- when `allowAutoSubscriptionCreation` is disabled and the specified subscription (`Durable`) on the topic does not exist when trying to subscribe via a consumer, the server should reject the request directly by `handleSubscribe` in `ServerCnx`
- create the subscription on the coordination topic if it does not exist when initializing `WorkerService`
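
The rejection rule in the second bullet can be sketched as a single predicate. The helper below is hypothetical; only the flag name `allowAutoSubscriptionCreation` and the durable-subscription condition come from the description, and the real check lives in `ServerCnx.handleSubscribe`.

```java
import java.util.Set;

/** Hypothetical sketch of the subscribe-time check; not the actual ServerCnx logic. */
final class SubscriptionGate {
    /**
     * Rejects the request only when auto-creation is disabled, the subscription
     * is durable, and it does not already exist on the topic.
     */
    static boolean shouldReject(boolean allowAutoSubscriptionCreation,
                                boolean durable,
                                Set<String> existingSubscriptions,
                                String subscriptionName) {
        return !allowAutoSubscriptionCreation
                && durable
                && !existingSubscriptions.contains(subscriptionName);
    }
}
```

With the default of `true` for the flag, the predicate never rejects, so existing deployments keep their current behavior.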
Similar to the change you already merged for AvroSchemaTest.java(#6247):
`jsonSchema.getSchemaInfo().getSchema()` in `pulsar-client/src/test/java/org/apache/pulsar/client/impl/schema/JSONSchemaTest.java` returns a JSON object. `schemaJson` compares with hard-coded JSON String. However, the order of entries in `schemaJson` is not guaranteed. Similarly, test `testKeyValueSchemaInfoToString` in `pulsar-client/src/test/java/org/apache/pulsar/client/impl/schema/KeyValueSchemaInfoTest.java` returns a JSON object. `havePrimitiveType` compares with hard-coded JSON String, and the order of entries in `havePrimitiveType` is not guaranteed.


This PR proposes to use JSONAssert and modify the corresponding JSON test assertions so that the test is more stable.

### Motivation

Using JSONAssert and modifying the corresponding JSON test assertions so that the test is more stable.

### Modifications

Adding `assertJSONEqual` method and replacing `assertEquals` with it in tests `testAllowNullSchema`, `testNotAllowNullSchema` and `testKeyValueSchemaInfoToString`.
* Enhance Authorization by adding TenantAdmin interface

* Remove debugging comment

Co-authored-by: Sanjeev Kulkarni <[email protected]>
### Motivation

Master Issue: #5454 

When one consumer subscribes to multiple topics, setSchemaInfoProvider() is overwritten by the consumer generated for the last topic.

### Modification
Clone the schema for each consumer generated per topic.
### Verifying this change
Add a schema test for it.
Fixes #6482

### Motivation
Prevent topic compaction from leaking direct memory

### Modifications

Several leaks were discovered using Netty leak detection and code review.
* `CompactedTopicImpl.readOneMessageId` would get an `Enumeration` of `LedgerEntry`, but did not release the underlying buffers. Fix: iterate through the `Enumeration` and release the underlying buffers. Instead of logging the case where the `Enumeration` did not contain any elements, complete the future exceptionally with the message (it will be logged by Caffeine).
* Two main sources of leaks in `TwoPhaseCompactor`. The `RawBatchConverter.rebatchMessage` method failed to close/release a `ByteBuf` (uncompressedPayload). Also, the `ByteBuf` returned by `RawBatchConverter.rebatchMessage` was not closed. The first one was easy to fix (release the buffer). To fix the second one and make the code easier to read, I decided not to let `RawBatchConverter.rebatchMessage` close the message read from the topic; instead, the message read from the topic is closed in a try/finally clause surrounding most of the method body that handles a message from a topic (in the phase two loop). Then, if a new message was produced by `RawBatchConverter.rebatchMessage`, we check for it after we have added the message to the compacted ledger and release it.
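
The ownership contract described in the second bullet can be sketched as follows. This is an illustration only: `CountedBuffer` is a hypothetical stand-in for a reference-counted Netty buffer, and `CompactionStep.handle` is not the real `TwoPhaseCompactor` method.

```java
import java.util.Optional;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical reference-counted buffer used in place of a Netty ByteBuf. */
final class CountedBuffer {
    private int refCnt = 1;

    void release() {
        if (refCnt <= 0) {
            throw new IllegalStateException("buffer already released");
        }
        refCnt--;
    }

    int refCnt() {
        return refCnt;
    }
}

final class CompactionStep {
    /**
     * The caller owns the message read from the topic and always releases it in finally.
     * The rebatch function must not release its input; if it produces a new message,
     * that message is released once it has been added to the compacted ledger.
     */
    static void handle(CountedBuffer original,
                       Function<CountedBuffer, Optional<CountedBuffer>> rebatch,
                       Consumer<CountedBuffer> addToLedger) {
        try {
            Optional<CountedBuffer> rebatched = rebatch.apply(original);
            if (rebatched.isPresent()) {
                CountedBuffer produced = rebatched.get();
                try {
                    addToLedger.accept(produced);
                } finally {
                    produced.release();
                }
            }
        } finally {
            original.release();
        }
    }
}
```

Keeping both releases in `finally` blocks means the buffers are freed even when rebatching or the ledger write throws.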

### Verifying this change
Modified `RawReaderTest.testBatchingRebatch` to show new contract.

One can run the test described to reproduce the issue, to verify no leak is detected.
…me. (#6478)

Fixes #6468

Fix creating a partitioned topic whose name is a substring of an existing topic name, and make partitioned topic creation async.
In jclouds 2.2.0, the [gson is shaded internally](https://issues.apache.org/jira/browse/JCLOUDS-1166). We could safely remove the jcloud-shade module as a cleanup.
### Modifications

The main modification was the reduction of repeated initialization of the variables in the tests.
### Motivation

Motivation is to have correct reference-metrics documentation.

### Modifications

There is an error in the `Topic metrics` section

`pulsar_producers_count` => `pulsar_in_messages_total`
### Motivation
Remove duplicate `cnx()` method for `producer`
### Motivation


Currently, the proxy only works to proxy v1/v2 functions routes to the
function worker.

### Modifications

This changes this code to proxy all routes for the function worker when
those routes match. At the moment this is still a static list of
prefixes, but in the future it may be possible to have this list of
prefixes be dynamically fetched from the REST routes.
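
A static prefix list of this kind might be matched as in the sketch below. The class name and the prefix values are assumptions made for illustration; the actual list used by the proxy is not shown in this description.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of prefix-based routing to the function worker. */
final class FunctionWorkerRoutes {
    // Assumed example prefixes; the real proxy maintains its own static list.
    private static final List<String> PREFIXES = Arrays.asList(
            "/admin/v2/functions",
            "/admin/v3/functions",
            "/admin/v3/sinks",
            "/admin/v3/sources");

    /** True when the path equals a known prefix or lies beneath it. */
    static boolean routesToFunctionWorker(String path) {
        for (String prefix : PREFIXES) {
            if (path.equals(prefix) || path.startsWith(prefix + "/")) {
                return true;
            }
        }
        return false;
    }
}
```

Matching on `prefix + "/"` instead of the bare prefix keeps a path such as `/admin/v3/sinks-other` from being routed by accident.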

### Verifying this change
- added some tests to ensure the routing works as expected
### Motivation
Right now, various Pulsar modules each have their own duplicate `RestException` class. Move `RestException` to a common place so that all modules use the same exception class and duplicate classes are avoided.
### Motivation
Fix the name of the proxy thread executor.
### Motivation

In some cases, users expect to consume messages from the beginning, similar to the `--from-beginning` option of the Kafka consumer CLI.

### Modifications

Add `--subscription-position` for `pulsar-client` and `pulsar-perf`.
…er. (#6499)

### Motivation

Once the broker service is started, clients can connect to the broker and send requests that depend on the namespace service, so we should create the namespace service before starting the broker service. Otherwise, an NPE occurs.

![image](https://user-images.githubusercontent.com/12592133/76090515-a9961400-5ff6-11ea-9077-cb8e79fa27c0.png)

![image](https://user-images.githubusercontent.com/12592133/76099838-b15db480-6006-11ea-8f39-31d820563c88.png)


### Modifications

Move the namespace service creation and the schema registry service creation before the broker service start.
…er When Ack Messages . (#6498)

### Motivation
Because of #6391 , acked messages were counted as unacked messages. 
Although messages from brokers were acknowledged, the following log was output.

```
2020-03-06 19:44:51.790 INFO  ConsumerImpl:174 | [persistent://public/default/t1, sub1, 0] Created consumer on broker [127.0.0.1:58860 -> 127.0.0.1:6650]
my-message-0: Fri Mar  6 19:45:05 2020
my-message-1: Fri Mar  6 19:45:05 2020
my-message-2: Fri Mar  6 19:45:05 2020
2020-03-06 19:45:15.818 INFO  UnAckedMessageTrackerEnabled:53 | [persistent://public/default/t1, sub1, 0] : 3 Messages were not acked within 10000 time

```

This behavior happened on master branch.
### Modification
`ProxyConfig` has a wrapper method for `proxyLogLevel` to expose it as an `Optional`. After #3543, we can define a config param as optional without creating wrapper methods.
@kaynewu
Owner Author

kaynewu commented Mar 10, 2020

merge check

@kaynewu kaynewu merged commit 42343f8 into kaynewu:master Mar 10, 2020