AWS S3 Java SDK: RequestClientOptions.setReadLimit


Consider the following S3 upload code:

    val tm: TransferManager = ???
    val putRequest = new PutObjectRequest(bucketName, keyName, inputStream, metadata)
    putRequest.setStorageClass(storageClass)
    putRequest.getRequestClientOptions.setReadLimit(100000)
    tm.upload(putRequest)

What is the use of the setReadLimit method? The AWS SDK Javadoc contains the following description:

Sets the optional mark-and-reset read limit used for signing and retry purposes. See also: InputStream.mark(int)

Is my assumption correct that it provides a kind of "checkpointing", such that if the network fails in the middle of the upload process, the API will (internally) perform a retry from the last "marked" position instead of from the beginning of the file?

The TransferManager does have support for the kind of "checkpointing" you describe, although it's not directly related to the readLimit parameter. S3 allows you to upload large objects in multiple parts, and the TransferManager automatically takes care of doing multipart uploads over a certain size. If the upload of a single part fails, the underlying AmazonS3Client only needs to retry the upload of that individual part. If you pass the TransferManager a File instead of an InputStream, it can even upload multiple parts of the file in parallel to speed up the transfer.
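To see why a File is so much easier to retry than an InputStream, here is a minimal stdlib sketch (this is not SDK code; the part size and readPart helper are made up for illustration). Each part of a file lives at a known offset, so "retrying" a part is just another seek and read:

```scala
import java.io.RandomAccessFile
import java.nio.file.Files

// Sketch (not the SDK's code): why a File makes part retries cheap.
// Each part's bytes live at a known offset, so a failed part can simply
// be re-read from disk -- no in-memory buffering is needed.
val partSize = 4 // tiny hypothetical part size, just for the demo

val path = Files.createTempFile("upload", ".bin")
Files.write(path, "0123456789abcdef".getBytes("UTF-8"))

val raf = new RandomAccessFile(path.toFile, "r")

def readPart(partIndex: Int): Array[Byte] = {
  raf.seek(partIndex.toLong * partSize) // jump straight to the part's offset
  val remaining = raf.length() - raf.getFilePointer
  val buf = new Array[Byte](math.min(partSize.toLong, remaining).toInt)
  raf.readFully(buf)
  buf
}

val firstTry = readPart(2) // "upload" part 2
val retry    = readPart(2) // a "retry" is just another seek + read
assert(firstTry.sameElements(retry))
raf.close()
```

This random access is also what lets the TransferManager read several parts of the same file in parallel.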

The readLimit parameter is used when you pass the TransferManager (or the underlying AmazonS3Client) an InputStream instead of a File. Compared to a File, which you can seek around in if you need to retry part of an upload, the InputStream interface is much more restrictive. In order to support retries on InputStream uploads, the AmazonS3Client uses the mark and reset methods of the InputStream interface, marking the stream at the beginning of each upload and resetting to the mark if it needs to retry.
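The mark/reset dance can be sketched with plain java.io classes. This is a simplification of what the client does inside its retry loop, with made-up variable names, not the SDK's actual code:

```scala
import java.io.{BufferedInputStream, ByteArrayInputStream}

// Simplified sketch of the mark/reset pattern the SDK uses for
// InputStream uploads (names and payload are illustrative).
val payload = "hello, s3!".getBytes("UTF-8")
val in = new BufferedInputStream(new ByteArrayInputStream(payload))

val readLimit = payload.length + 1 // how many bytes mark() must remember
in.mark(readLimit)                 // mark at the beginning of the "upload"

val firstAttempt = in.readNBytes(payload.length)  // attempt 1: send the bytes
// ... imagine the request failed here ...
in.reset()                                        // rewind to the mark
val secondAttempt = in.readNBytes(payload.length) // attempt 2: replay them

assert(firstAttempt.sameElements(secondAttempt))
in.close()
```

The crucial constraint is the readLimit argument to mark: the stream only guarantees that reset will work if no more than that many bytes have been read since the mark was set.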

Notice that the mark method takes a readLimit parameter, and is only obligated to "remember" as many bytes of the InputStream as you ask for in advance. Some InputStreams implement mark by allocating a new byte[readLimit] buffer to hold the underlying data in memory so it can be replayed if reset is called, which makes it dangerous to blindly mark using the length of the object to be uploaded (which might be several gigabytes). Instead, the AmazonS3Client defaults to calling mark with a value of 128KB; if your InputStream honors the readLimit, this means the AmazonS3Client won't be able to retry requests that fail after it has already sent more than the first 128KB.
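What happens once reading passes the limit can be demonstrated with a BufferedInputStream, one of the stream types that eventually honors the limit (it drops the mark lazily when its internal buffer needs refilling, so the small buffer size and byte counts below are chosen to force that; they are arbitrary demo values):

```scala
import java.io.{BufferedInputStream, ByteArrayInputStream, IOException}

// Sketch of what goes wrong past the read limit: once more than readLimit
// bytes have been read, the mark is discarded and reset() fails -- which
// for the S3 client would mean the upload cannot be retried.
val data = new Array[Byte](64)
val bufferSize = 8 // small internal buffer so the mark is dropped promptly
val in = new BufferedInputStream(new ByteArrayInputStream(data), bufferSize)

in.mark(8)        // promise to remember at most 8 bytes
in.readNBytes(32) // read well past the limit; the mark is invalidated

val resetFailed =
  try { in.reset(); false }
  catch { case _: IOException => true }

assert(resetFailed) // the stream can no longer be replayed
```

This is exactly the failure mode described above: a request that dies after more than readLimit bytes have been sent cannot be replayed from the mark.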

If you're using such an InputStream and want to dedicate more memory to buffering the uploaded data so that the AmazonS3Client can retry failures further along in the upload (or, conversely, if you'd rather use a smaller buffer and potentially see more failures), you can tune the value that gets used via setReadLimit, as in the question's code.
