Skip to content

#132 allow users to optionally provide an output buffer when calling transcode#136

Merged
mkitti merged 26 commits into
JuliaIO:masterfrom
baumgold:master
Apr 9, 2023
Merged

#132 allow users to optionally provide an output buffer when calling transcode#136
mkitti merged 26 commits into
JuliaIO:masterfrom
baumgold:master

Conversation

@baumgold

Copy link
Copy Markdown
Contributor

An alternative to #134 to solve #132 without code-duplication.

@Moelf

Moelf commented Mar 15, 2023

Copy link
Copy Markdown

I like this better

@svilupp

svilupp commented Mar 15, 2023

Copy link
Copy Markdown

Yeah, this is great. I was solving for minimum changes to the original code path, but this is much better/cleaner!

I’ll close my PR

@baumgold

Copy link
Copy Markdown
Contributor Author

Is anyone with write-access available to review this PR so we can get it approved/merged/released? Thanks!

@baumgold

Copy link
Copy Markdown
Contributor Author

@mkitti / @quinnj - are either of you able to review this? Thanks.

@mkitti mkitti left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add tests for all the new signatures. See if you can consolidate the docstring into one.

Comment thread src/transcode.jl
Comment thread src/transcode.jl
Comment thread src/transcode.jl Outdated
Comment thread src/transcode.jl Outdated
Comment thread src/transcode.jl Outdated
@quinnj quinnj requested a review from Drvi March 23, 2023 18:19
@quinnj

quinnj commented Mar 23, 2023

Copy link
Copy Markdown
Member

I'm going to request a review from @Drvi as well; will try to review myself a little later today

@baumgold

Copy link
Copy Markdown
Contributor Author

@mkitti - Thanks for your review. I believe I addressed all of your suggestions. Could you please review again? Thanks.

@baumgold baumgold requested review from mkitti and removed request for Drvi March 23, 2023 20:00
Comment thread src/transcode.jl
Co-authored-by: Mark Kittisopikul <mkitti@users.noreply.github.com>
@baumgold baumgold requested a review from mkitti March 25, 2023 10:53

@mkitti mkitti left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. Let's give the others a chance to review. Ping me in a week.

Comment thread src/transcode.jl
@baumgold

baumgold commented Apr 3, 2023

Copy link
Copy Markdown
Contributor Author

@quinnj / @Drvi / @mkitti - any new thoughts/comments? If not, shall we merge? Thanks.

@mkitti mkitti left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more thought is that documenting the argument as ByteData might not be very useful. Rather we should just spell it out as Union{Vector{UInt8},Base.CodeUnits{UInt8}}.

Comment thread src/transcode.jl Outdated
Comment thread src/transcode.jl Outdated
Comment thread src/transcode.jl Outdated
baumgold and others added 5 commits April 3, 2023 09:42
Co-authored-by: Mark Kittisopikul <mkitti@users.noreply.github.com>
Co-authored-by: Mark Kittisopikul <mkitti@users.noreply.github.com>
Co-authored-by: Mark Kittisopikul <mkitti@users.noreply.github.com>
@mkitti

mkitti commented Apr 3, 2023

Copy link
Copy Markdown
Member

Looks good to me. Will merge soon unless someone else speaks up.

@JoaoAparicio

JoaoAparicio commented Apr 3, 2023

Copy link
Copy Markdown
Contributor

May I make a comment...

I note that as it is, there's no method with signature

transcode(::Type{TranscodingStreams.Codec}, ::Vector{UInt8}, ::Vector{UInt8})

There's

transcode(::Type{TranscodingStreams.Codec}, ::Vector{UInt8})

and there's

transcode(::Codec, ::Vector{UInt8}, ::Vector{UInt8})

As you can see the number of possibilities for manually defined methods grows combinatorially.

I would suggest that we fully resolve first the first argument, then the second etc. Then the combinations grow linearly. In this way we need only 3 manually defined methods to cover all combinations (right now there are 4 and not all combinations are covered)

Comment thread src/transcode.jl Outdated
Comment thread src/transcode.jl Outdated
Comment thread src/transcode.jl Outdated
Comment thread src/transcode.jl
@baumgold baumgold requested a review from mkitti April 8, 2023 02:02
Comment thread src/buffer.jl Outdated
Comment thread src/transcode.jl
@mkitti

mkitti commented Apr 8, 2023

Copy link
Copy Markdown
Member

I'm feeling pretty good about where we are on this. I'm aiming to merge within the next 24 hours.

@mkitti mkitti self-requested a review April 8, 2023 02:24
@mkitti mkitti self-assigned this Apr 8, 2023
@mkitti

mkitti commented Apr 9, 2023

Copy link
Copy Markdown
Member

I see no further comments or objections. Merging.

Please submit another pull request or issue to iterate further.

I will arrange for a release within the next five days. I'm currently targeting Tuesday.

I believe this to be a non-breaking change, but I will review further later.

@mkitti mkitti merged commit 3ece750 into JuliaIO:master Apr 9, 2023
@JoaoAparicio

Copy link
Copy Markdown
Contributor

@mkitti this is where this will be used: apache/arrow-julia#422

@mkitti

mkitti commented Apr 11, 2023

Copy link
Copy Markdown
Member

Released as 0.9.12

@quinnj

quinnj commented Apr 11, 2023

Copy link
Copy Markdown
Member

Sorry for not being able to help review earlier. I think this is awesome. One note I want to follow up on: it seems that currently "bad things" happen when you try to pass in the same buffer as the input and output buffers to transcode. Can we add a validation check and throw an error in that case? (I just tried it on a 20mb stream and it blew up to 17.5gb before erroring!)

Or even better: can that be made to work? i.e. I have a use case in CloudStore.jl where people want to provide pre-allocated buffers in which to download a compressed file and then be able to decompress the data "in place". So they'd provide a big enough buffer for the final decompressed data, download the compressed data into the first part of the buffer, then do a decompress in place; is that possible?

@baumgold

Copy link
Copy Markdown
Contributor Author

Can we add a validation check and throw an error in that case? (I just tried it on a 20mb stream and it blew up to 17.5gb before erroring!)

#140

Or even better: can that be made to work? i.e. I have a use case in CloudStore.jl where people want to provide pre-allocated buffers in which to download a compressed file and then be able to decompress the data "in place". So they'd provide a big enough buffer for the final decompressed data, download the compressed data into the first part of the buffer, then do a decompress in place; is that possible?

I don't know if this is possible. Is there data to show that this is a performance bottleneck? Can you just allocate a separate buffer?

bauglir added a commit to bauglir/TranscodingStreams.jl that referenced this pull request Apr 14, 2023
The examples in many of the codec packages rely on being able to call
`transcode` providing a `Codec` type and a string[^1][^2]. JuliaIO#136 removed
type annotations from the trailing arguments for `transcode(::Type{C},
...) where {C<:Codec}` causing a method ambiguity with `transcode(T,
src::String)` in Julia `Base`[^3]. This adds an additional method to
`Base.transcode` to resolve this ambiguity.

Fixes JuliaIO#139.

[^1]: https://github.com/JuliaIO/CodecZlib.jl/tree/f9fddaa28c093c590a7a93358709df2945306bc7#usage
[^2]: https://github.com/JuliaIO/CodecZstd.jl/tree/6327ffa9a3a12fc46d465dcfc8b30bed91cf284b#usage
[^3]: https://github.com/JuliaLang/julia/blob/ff7b8eb00bf887f20bf57fb7e53be0070a242c07/base/c.jl#L306
bauglir added a commit to bauglir/TranscodingStreams.jl that referenced this pull request Apr 15, 2023
The examples in many of the codec packages rely on being able to call
`transcode` providing a `Codec` type and a string[^1][^2]. JuliaIO#136 removed
type annotations from the trailing arguments for `transcode(::Type{C},
...) where {C<:Codec}` causing a method ambiguity with `transcode(T,
src::String)` in Julia `Base`[^3]. This adds an additional method to
`Base.transcode` to resolve this ambiguity.

Fixes JuliaIO#139.

[^1]: https://github.com/JuliaIO/CodecZlib.jl/tree/f9fddaa28c093c590a7a93358709df2945306bc7#usage
[^2]: https://github.com/JuliaIO/CodecZstd.jl/tree/6327ffa9a3a12fc46d465dcfc8b30bed91cf284b#usage
[^3]: https://github.com/JuliaLang/julia/blob/ff7b8eb00bf887f20bf57fb7e53be0070a242c07/base/c.jl#L306
mkitti pushed a commit that referenced this pull request Apr 20, 2023
The examples in many of the codec packages rely on being able to call
`transcode` providing a `Codec` type and a string[^1][^2]. #136 removed
type annotations from the trailing arguments for `transcode(::Type{C},
...) where {C<:Codec}` causing a method ambiguity with `transcode(T,
src::String)` in Julia `Base`[^3]. This adds an additional method to
`Base.transcode` to resolve this ambiguity.

Fixes #139.

[^1]: https://github.com/JuliaIO/CodecZlib.jl/tree/f9fddaa28c093c590a7a93358709df2945306bc7#usage
[^2]: https://github.com/JuliaIO/CodecZstd.jl/tree/6327ffa9a3a12fc46d465dcfc8b30bed91cf284b#usage
[^3]: https://github.com/JuliaLang/julia/blob/ff7b8eb00bf887f20bf57fb7e53be0070a242c07/base/c.jl#L306
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants