services/horizon/docker/ledgerexporter: deploy ledgerexporter image as service #4490

Merged
25 changes: 17 additions & 8 deletions exp/services/ledgerexporter/main.go
@@ -10,6 +10,7 @@ import (
"strings"
"time"

"github.com/aws/aws-sdk-go/service/s3"
"github.com/stellar/go/historyarchive"
"github.com/stellar/go/ingest/ledgerbackend"
"github.com/stellar/go/network"
@@ -30,6 +31,7 @@ func main() {
continueFromLatestLedger := flag.Bool("continue", false, "start export from the last exported ledger (as indicated in the target's /latest path)")
endingLedger := flag.Uint("end-ledger", 0, "ledger at which to stop the export (must be a closed ledger), 0 means no ending")
writeLatestPath := flag.Bool("write-latest-path", true, "update the value of the /latest path on the target")
captiveCoreUseDb := flag.Bool("captive-core-use-db", true, "configure captive core to store database on disk in working directory rather than in memory")
flag.Parse()

logger.SetLevel(supportlog.InfoLevel)
@@ -52,14 +54,16 @@ func main() {
CheckpointFrequency: 64,
Log: logger.WithField("subservice", "stellar-core"),
Toml: captiveCoreToml,
UseDB: *captiveCoreUseDb,
}
core, err := ledgerbackend.NewCaptive(captiveConfig)
logFatalIf(err, "Could not create captive core instance")

target, err := historyarchive.ConnectBackend(
*targetUrl,
storage.ConnectOptions{
Context: context.Background(),
Context: context.Background(),
S3WriteACL: s3.ObjectCannedACLBucketOwnerFullControl,
},
)
logFatalIf(err, "Could not connect to target")
@@ -68,18 +72,23 @@ func main() {
// Build the appropriate range for the given backend state.
startLedger := uint32(*startingLedger)
endLedger := uint32(*endingLedger)
if startLedger < 2 {
logger.Fatalf("-start-ledger must be >= 2")
}
if endLedger != 0 && endLedger < startLedger {
logger.Fatalf("-end-ledger must be >= -start-ledger")
}

logger.Infof("processing requested range of -start-ledger=%v, -end-ledger=%v", startLedger, endLedger)
if *continueFromLatestLedger {
if startLedger != 0 {
logger.Fatalf("-start-ledger and -continue cannot both be set")
}
startLedger = readLatestLedger(target)
logger.Infof("continue flag was enabled, next ledger found was %v", startLedger)
}

if startLedger < 2 {
logger.Fatalf("-start-ledger must be >= 2")
}
if endLedger != 0 && endLedger < startLedger {
logger.Fatalf("-end-ledger must be >= -start-ledger")
}

var ledgerRange ledgerbackend.Range
if endLedger == 0 {
ledgerRange = ledgerbackend.UnboundedRange(startLedger)
@@ -91,7 +100,7 @@
err = core.PrepareRange(context.Background(), ledgerRange)
logFatalIf(err, "could not prepare range")

for nextLedger := startLedger; nextLedger <= endLedger; {
for nextLedger := startLedger; endLedger < 1 || nextLedger <= endLedger; {
Contributor Author:

during testing with END=0, ran into this as it was stopping before generating anything

Contributor:

Good catch

ledger, err := core.GetLedger(context.Background(), nextLedger)
if err != nil {
logger.WithError(err).Warnf("could not fetch ledger %v, retrying", nextLedger)
1 change: 1 addition & 0 deletions services/horizon/docker/ledgerexporter/Dockerfile
@@ -27,6 +27,7 @@ RUN apt-get update && apt-get install -y stellar-core=${STELLAR_CORE_VERSION}
RUN apt-get clean

ADD captive-core-pubnet.cfg /
ADD captive-core-testnet.cfg /

ADD start /
RUN ["chmod", "+x", "start"]
30 changes: 30 additions & 0 deletions services/horizon/docker/ledgerexporter/captive-core-testnet.cfg
@@ -0,0 +1,30 @@
PEER_PORT=11725
DATABASE = "sqlite3:///cc/stellar.db"

UNSAFE_QUORUM=true
FAILURE_SAFETY=1

[[HOME_DOMAINS]]
HOME_DOMAIN="testnet.stellar.org"
QUALITY="HIGH"

[[VALIDATORS]]
NAME="sdf_testnet_1"
HOME_DOMAIN="testnet.stellar.org"
PUBLIC_KEY="GDKXE2OZMJIPOSLNA6N6F2BVCI3O777I2OOC4BV7VOYUEHYX7RTRYA7Y"
ADDRESS="core-testnet1.stellar.org"
HISTORY="curl -sf http://history.stellar.org/prd/core-testnet/core_testnet_001/{0} -o {1}"

[[VALIDATORS]]
NAME="sdf_testnet_2"
HOME_DOMAIN="testnet.stellar.org"
PUBLIC_KEY="GCUCJTIYXSOXKBSNFGNFWW5MUQ54HKRPGJUTQFJ5RQXZXNOLNXYDHRAP"
ADDRESS="core-testnet2.stellar.org"
HISTORY="curl -sf http://history.stellar.org/prd/core-testnet/core_testnet_002/{0} -o {1}"

[[VALIDATORS]]
NAME="sdf_testnet_3"
HOME_DOMAIN="testnet.stellar.org"
PUBLIC_KEY="GC2V2EFSXN6SQTWVYA5EPJPBWWIMSD2XQNKUOHGEKB535AQE2I6IXV2Z"
ADDRESS="core-testnet3.stellar.org"
HISTORY="curl -sf http://history.stellar.org/prd/core-testnet/core_testnet_003/{0} -o {1}"
125 changes: 125 additions & 0 deletions services/horizon/docker/ledgerexporter/ledgerexporter.yml
@@ -0,0 +1,125 @@
# this file contains the ledgerexporter deployment and its config artifacts.
#
# when applying the manifest on a cluster, make sure to include the namespace destination,
# as the manifest does not specify a namespace; otherwise it'll go in your current kubectl context.
#
# make sure to set the secrets values, substituting the <base64 encoded value here> placeholders.
#
# $ kubectl apply -f ledgerexporter.yml -n horizon-dev
apiVersion: v1
kind: ConfigMap
metadata:
annotations:
fluxcd.io/ignore: "true"
labels:
app: ledgerexporter
name: ledgerexporter-pubnet-env
data:
# when using core 'on disk', the earliest ledger streamed out after catchup to 2 is 3,
# whereas in-memory mode streams out 2; adjusted here, otherwise horizon ingest will abort
# and stop the process with an error that ledger 3 is not <= the expected ledger of 2.
START: "0"
END: "0"

# can only have CONTINUE or START set, not both.
CONTINUE: "true"
WRITE_LATEST_PATH: "true"
CAPTIVE_CORE_USE_DB: "true"

# configure the network to export
HISTORY_ARCHIVE_URLS: "https://history.stellar.org/prd/core-live/core_live_001,https://history.stellar.org/prd/core-live/core_live_002,https://history.stellar.org/prd/core-live/core_live_003"
NETWORK_PASSPHRASE: "Public Global Stellar Network ; September 2015"
# can refer to canned cfg's for pubnet and testnet which are included on the image
# `/captive-core-pubnet.cfg` or `/captive-core-testnet.cfg`.
# If exporting a standalone network, then mount a volume to the pod container with your standalone core's .cfg,
# and set full path to that volume here
CAPTIVE_CORE_CONFIG: "/captive-core-pubnet.cfg"

# example of testnet network config.
# HISTORY_ARCHIVE_URLS: "https://history.stellar.org/prd/core-testnet/core_testnet_001,https://history.stellar.org/prd/core-testnet/core_testnet_002"
# NETWORK_PASSPHRASE: "Test SDF Network ; September 2015"
# CAPTIVE_CORE_CONFIG: "/captive-core-testnet.cfg"

# provide the url for the external s3 bucket to be populated
# update the ledgerexporter-pubnet-secret to have correct aws key/secret for access to the bucket
ARCHIVE_TARGET: "s3://horizon-ledgermeta-prodnet-test"
---
apiVersion: v1
kind: Secret
metadata:
labels:
app: ledgerexporter
name: ledgerexporter-pubnet-secret
type: Opaque
data:
AWS_REGION: <base64 encoded value here>

Contributor Author (@sreuland, Jul 29, 2022):

AWS credentials get loaded into the cluster as secrets, which the deployment pulls into the ledgerexporter container as env variables.

AWS_ACCESS_KEY_ID: <base64 encoded value here>
AWS_SECRET_ACCESS_KEY: <base64 encoded value here>
---
# running captive core with on-disk mode limits RAM to around 2G usage, but
# requires some dedicated disk storage space that has at least 3k IOPS for read/write.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: ledgerexporter-pubnet-core-storage
spec:
accessModes:
- ReadWriteOnce
resources:
requests:
storage: 500Gi
storageClassName: default
volumeMode: Filesystem
---
apiVersion: apps/v1
kind: Deployment
metadata:
annotations:
fluxcd.io/ignore: "true"
deployment.kubernetes.io/revision: "3"
labels:
app: ledgerexporter-pubnet
name: ledgerexporter-pubnet-deployment
spec:
selector:
matchLabels:
app: ledgerexporter-pubnet
replicas: 1
template:
metadata:
annotations:
fluxcd.io/ignore: "true"
# if we add metrics to ledgerexporter at some point,
# this just needs to be set to true
prometheus.io/port: "6060"
prometheus.io/scrape: "false"
labels:
app: ledgerexporter-pubnet
spec:
containers:
- envFrom:
- secretRef:
name: ledgerexporter-pubnet-secret
- configMapRef:
name: ledgerexporter-pubnet-env
image: stellar/horizon-ledgerexporter:latest
imagePullPolicy: Always
name: ledgerexporter-pubnet
resources:
limits:
cpu: 3
memory: 8Gi
requests:
cpu: 500m
memory: 2Gi
volumeMounts:
- mountPath: /cc
name: core-storage
dnsPolicy: ClusterFirst
volumes:
- name: core-storage
persistentVolumeClaim:
claimName: ledgerexporter-pubnet-core-storage



18 changes: 13 additions & 5 deletions services/horizon/docker/ledgerexporter/start
@@ -1,12 +1,19 @@
#! /usr/bin/env bash
set -e

START="${START:=0}"
START="${START:=2}"
END="${END:=0}"
CONTINUE="${CONTINUE:=false}"
# Writing to /latest is disabled by default to avoid race conditions between parallel container runs
WRITE_LATEST_PATH="${WRITE_LATEST_PATH:=false}"

# config defaults to pubnet core, any other network requires setting all 3 of these in container env
NETWORK_PASSPHRASE="${NETWORK_PASSPHRASE:=Public Global Stellar Network ; September 2015}"
HISTORY_ARCHIVE_URLS="${HISTORY_ARCHIVE_URLS:=https://s3-eu-west-1.amazonaws.com/history.stellar.org/prd/core-live/core_live_001}"
CAPTIVE_CORE_CONFIG="${CAPTIVE_CORE_CONFIG:=/captive-core-pubnet.cfg}"

CAPTIVE_CORE_USE_DB="${CAPTIVE_CORE_USE_DB:=true}"

if [ -z "$ARCHIVE_TARGET" ]; then
echo "error: undefined ARCHIVE_TARGET env variable"
exit 1
@@ -39,9 +46,10 @@ fi
echo "START: $START END: $END"

export TRACY_NO_INVARIANT_CHECK=1
/ledgerexporter --target $ARCHIVE_TARGET \
--captive-core-toml-path /captive-core-pubnet.cfg \
--history-archive-urls 'https://history.stellar.org/prd/core-live/core_live_001' --network-passphrase 'Public Global Stellar Network ; September 2015' \
--continue="$CONTINUE" --write-latest-path="$WRITE_LATEST_PATH" --start-ledger "$START" --end-ledger "$END"
/ledgerexporter --target "$ARCHIVE_TARGET" \
--captive-core-toml-path "$CAPTIVE_CORE_CONFIG" \
--history-archive-urls "$HISTORY_ARCHIVE_URLS" --network-passphrase "$NETWORK_PASSPHRASE" \
--continue="$CONTINUE" --write-latest-path="$WRITE_LATEST_PATH" \
--start-ledger "$START" --end-ledger "$END" --captive-core-use-db="$CAPTIVE_CORE_USE_DB"

echo "OK"
4 changes: 4 additions & 0 deletions support/storage/main.go
@@ -31,6 +31,9 @@ type ConnectOptions struct {

// Wrap the Storage after connection. For example, to add a caching or introspection layer.
Wrap func(Storage) (Storage, error)

// When putting a file object to the s3 bucket, specify the ACL for the object.
S3WriteACL string
}

func ConnectBackend(u string, opts ConnectOptions) (Storage, error) {
@@ -60,6 +63,7 @@ func ConnectBackend(u string, opts ConnectOptions) (Storage, error) {
opts.S3Region,
opts.S3Endpoint,
opts.UnsignedRequests,
opts.S3WriteACL,
)

case "gcs":
15 changes: 13 additions & 2 deletions support/storage/s3.go
@@ -12,15 +12,17 @@ import (
"github.com/aws/aws-sdk-go/aws"
"github.com/aws/aws-sdk-go/aws/session"
"github.com/aws/aws-sdk-go/service/s3"
"github.com/aws/aws-sdk-go/service/s3/s3iface"
"github.com/stellar/go/support/errors"
)

type S3Storage struct {
ctx context.Context
svc *s3.S3
svc s3iface.S3API
bucket string
prefix string
unsignedRequests bool
writeACLrule string
}

func NewS3Storage(
@@ -30,6 +32,7 @@ func NewS3Storage(
region string,
endpoint string,
unsignedRequests bool,
writeACLrule string,
) (Storage, error) {
log.WithFields(log.Fields{"bucket": bucket,
"prefix": prefix,
@@ -52,6 +55,7 @@
bucket: bucket,
prefix: prefix,
unsignedRequests: unsignedRequests,
writeACLrule: writeACLrule,
}
return &backend, nil
}
@@ -139,6 +143,13 @@ func (b *S3Storage) Size(pth string) (int64, error) {
}
}

func (b *S3Storage) GetACLWriteRule() string {
if b.writeACLrule == "" {
return s3.ObjectCannedACLPublicRead
}
return b.writeACLrule
}

func (b *S3Storage) PutFile(pth string, in io.ReadCloser) error {
var buf bytes.Buffer
_, err := buf.ReadFrom(in)
@@ -150,7 +161,7 @@ func (b *S3Storage) PutFile(pth string, in io.ReadCloser) error {
params := &s3.PutObjectInput{
Bucket: aws.String(b.bucket),
Key: aws.String(key),
ACL: aws.String(s3.ObjectCannedACLPublicRead),
Contributor (@2opremio):

Ahhh, I see you changed it here. Is this horizon-light-only code? Otherwise we would need to check whether it breaks something.

Contributor Author (@sreuland, Aug 1, 2022):

yes, I was wondering the same on usage, it seems like it may be contained to just the AWS Batch account S3 for horizon light ledgerexport?

Contributor:

Does the git blame on this line have any meaningful history behind it? I'm pretty sure this storage backend is used by Horizon (though maybe not by SDF's deployments), but I'm not sure.

Contributor Author:

Yeah, that ACL has been there since 2017, so it could be used in various ways. @bartekn, do you have insight on the types of usages of S3Storage.PutFile, like which SDF S3 buckets it has been used with, or whether customers use it for their own S3 stuff?

I think I'm going to revert this line, leave that ACL in to avoid unraveling, and instead add ConnectOptions.DisableS3ACL and S3Storage.disableS3ACL; ledgerexporter can then set that to true when invoking ConnectBackend. lmk if any thoughts either way, thx!

Contributor Author:

ok, I went with that change and added a config option to override the S3 file ACL rule applied - 452b20c

bucket owner policy works with ObjectCannedACLBucketOwnerFullControl

Contributor (@2opremio, Aug 4, 2022):

Good! Will the code still work if we remove ACL support from the bucket? (which is what AWS recommends)

Contributor Author:

It will if the client is the bucket owner. For any other principals that should have access, adding an inline policy on the bucket with allow statements should provide write and read rules. I added such a policy doc to horizon-ledgermeta-prodnet-test. I noticed horizon-index-prodnet-test had ACLs disabled with owner enforced already, but it doesn't have a policy set yet; we can update that for Public(read) and other principals like k8s, which would need read for web and write for a single indexer (watching for latest).
ACL: aws.String(b.GetACLWriteRule()),
Body: bytes.NewReader(buf.Bytes()),
}
req, _ := b.svc.PutObjectRequest(params)