Message ID | 20240228105102.3399-1-zajec5@gmail.com |
---|---|
State | Accepted |
Delegated to: | Rafał Miłecki |
Headers | show |
Series | [V3,1/3] base-files: sysupgrade: add tar.sh with helpers for building archives | expand |
Hi, > [...] > This allows building uncompressed tar archives from shell scripts (and > compressing them later if needed) > > Signed-off-by: Rafał Miłecki <rafal@milecki.pl> Signed-off-by: Jo-Philipp Wich <jo@mein.io> ~ Jo
Hello Rafał, Sorry for the late review. I know this will be used for a limited context but it can be constructed in such a way it could be used for more cases without increasing its complexity. > From: Jo-Philipp Wich <jo@mein.io> > > This allows building uncompressed tar archives from shell scripts (and > compressing them later if needed) > > Signed-off-by: Rafał Miłecki <rafal@milecki.pl> > --- > V2: Simplify dd in __tar_print_padding (I still think helper is useful) > Hardcode 0/0/ root/root for now as most likely it'll be enough > Simplify name validation (leasing slash) > Reorder some variables > V3: Fix dd in __tar_print_padding > Rename functions > Drop unused functions > Document usage > > package/base-files/files/lib/upgrade/tar.sh | 71 +++++++++++++++++++++ > 1 file changed, 71 insertions(+) > create mode 100644 package/base-files/files/lib/upgrade/tar.sh > > diff --git a/package/base-files/files/lib/upgrade/tar.sh b/package/base-files/files/lib/upgrade/tar.sh > new file mode 100644 > index 0000000000..a9d1d559e6 > --- /dev/null > +++ b/package/base-files/files/lib/upgrade/tar.sh > @@ -0,0 +1,71 @@ > +# SPDX-License-Identifier: GPL-2.0-or-later OR MIT > + > +# Example usage: > +# > +# { > +# tar_print_member "date.txt" "It's $(date +"%Y")" > +# tar_print_trailer > +# } > test.tar > + > +__tar_print_padding() { > + dd if=/dev/zero bs=1 count=$1 2>/dev/null > +} > + > +tar_print_member() { > + local name="$1" > + local content="$2" > + local mtime="${3:-$(date +%s)}" > + local mode=644 > + local uid=0 > + local gid=0 > + local size=${#content} > + local type=0 > + local link="" > + local username="root" > + local groupname="root" > + > + # 100 byte of padding bytes, using 0x01 since the shell does not tolerate null bytes in strings > + local pad=$'\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1' \1 is still a valid filename character. Shell cannot save a string with \0 but printf can handle it nicely (as well as pipes). If you build the header using a function at the moment you use it, you could avoid the \1 hack. print_header() { local filename=$1 local mode=$2 ... local checksum=$7 ... printf '%s' "$filename"; printf '\x00%.0s' $(seq $((100 - ${#filename}))) ... } print_header "$filename" "$mode" \ "$uid" "$gid" .. | hexdump -ve '1/1 "%u If you don't like the printf trick to repeat a char, you could use __tar_print_padding (or use printf for both). You could call it twice, first without a checksum to calculate it and then a new one to generate the final output. > + > + # validate name (strip leading slash if present) > + name=${name#/} > + > + # truncate string header values to their maximum length > + name=${name:0:100} It feels strange to simply truncate the name. If you need to truncate it, you are implying it might be larger. In that case, it should fail if it cannot fit a specific filename. Otherwise, you could end up overwriting two filenames with the same first 100 bytes. And BTW, shouldn't ustar support 255 or 256 filenames? > + link=${link:0:100} > + username=${username:0:32} > + groupname=${groupname:0:32} > + > + # construct header part before checksum field > + local header1="${name}${pad:0:$((100 - ${#name}))}" > + header1="${header1}$(printf '%07d\1' $mode)" > + header1="${header1}$(printf '%07o\1' $uid)" > + header1="${header1}$(printf '%07o\1' $gid)" > + header1="${header1}$(printf '%011o\1' $size)" > + header1="${header1}$(printf '%011o\1' $mtime)" > + > + # construct header part after checksum field > + local header2="$(printf '%d' $type)" > + header2="${header2}${link}${pad:0:$((100 - ${#link}))}" > + header2="${header2}ustar ${pad:0:1}" > + header2="${header2}${username}${pad:0:$((32 - ${#username}))}" > + header2="${header2}${groupname}${pad:0:$((32 - ${#groupname}))}" It seems to be a strange ustar format. Shouldn't it have an ustar version before the username? And Device major/minor after groupname, followed by more 150 filename prefix? It might still work if the reader interprets it as a v7. > + # calculate checksum over header fields > + local checksum=0 > + for byte in $(printf '%s%8s%s' "$header1" "" "$header2" | tr '\1' '\0' | hexdump -ve '1/1 "%u "'); do > + checksum=$((checksum + byte)) > + done > + Maybe use a single line like: checksum=$(print header... | hexdump -ve '1/1 "%u\n"' | awk '{s+=$1} END {print s}') ? > + # print member header, padded to 512 byte > + printf '%s%06o\0 %s' "$header1" $checksum "$header2" | tr '\1' '\0' > + __tar_print_padding 183 > + > + # print content data, padded to multiple of 512 byte > + printf "%s" "$content" > + __tar_print_padding $((512 - (size % 512))) > +} > + > +tar_print_trailer() { > + __tar_print_padding 1024 > +} > -- > 2.35.3 >
diff --git a/package/base-files/files/lib/upgrade/tar.sh b/package/base-files/files/lib/upgrade/tar.sh new file mode 100644 index 0000000000..a9d1d559e6 --- /dev/null +++ b/package/base-files/files/lib/upgrade/tar.sh @@ -0,0 +1,71 @@ +# SPDX-License-Identifier: GPL-2.0-or-later OR MIT + +# Example usage: +# +# { +# tar_print_member "date.txt" "It's $(date +"%Y")" +# tar_print_trailer +# } > test.tar + +__tar_print_padding() { + dd if=/dev/zero bs=1 count=$1 2>/dev/null +} + +tar_print_member() { + local name="$1" + local content="$2" + local mtime="${3:-$(date +%s)}" + local mode=644 + local uid=0 + local gid=0 + local size=${#content} + local type=0 + local link="" + local username="root" + local groupname="root" + + # 100 byte of padding bytes, using 0x01 since the shell does not tolerate null bytes in strings + local pad=$'\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1\1' + + # validate name (strip leading slash if present) + name=${name#/} + + # truncate string header values to their maximum length + name=${name:0:100} + link=${link:0:100} + username=${username:0:32} + groupname=${groupname:0:32} + + # construct header part before checksum field + local header1="${name}${pad:0:$((100 - ${#name}))}" + header1="${header1}$(printf '%07d\1' $mode)" + header1="${header1}$(printf '%07o\1' $uid)" + header1="${header1}$(printf '%07o\1' $gid)" + header1="${header1}$(printf '%011o\1' $size)" + header1="${header1}$(printf '%011o\1' $mtime)" + + # construct header part after checksum field + local header2="$(printf '%d' $type)" + header2="${header2}${link}${pad:0:$((100 - ${#link}))}" + header2="${header2}ustar ${pad:0:1}" + header2="${header2}${username}${pad:0:$((32 - ${#username}))}" + header2="${header2}${groupname}${pad:0:$((32 - ${#groupname}))}" + + # calculate checksum over header fields + local checksum=0 + for byte in $(printf '%s%8s%s' "$header1" "" "$header2" | tr '\1' '\0' | hexdump -ve '1/1 "%u "'); do + checksum=$((checksum + byte)) + done + + # print member header, padded to 512 byte + printf '%s%06o\0 %s' "$header1" $checksum "$header2" | tr '\1' '\0' + __tar_print_padding 183 + + # print content data, padded to multiple of 512 byte + printf "%s" "$content" + __tar_print_padding $((512 - (size % 512))) +} + +tar_print_trailer() { + __tar_print_padding 1024 +}