
apache-parquet-utils-15.0.2-1.1 RPM for armv6hl

From OpenSuSE Ports Tumbleweed for armv6hl

Name: apache-parquet-utils Distribution: openSUSE Tumbleweed
Version: 15.0.2 Vendor: openSUSE
Release: 1.1 Build date: Sat Mar 23 16:23:23 2024
Group: Productivity/Scientific/Math Build host: reproducible
Size: 148033 Source RPM: apache-arrow-15.0.2-1.1.src.rpm
Packager: http://bugs.opensuse.org
Url: https://arrow.apache.org/
Summary: Development platform for in-memory data - development files
Apache Arrow is a cross-language development platform for in-memory
data. It specifies a standardized language-independent columnar memory
format for flat and hierarchical data, organized for efficient
analytic operations on modern hardware. It also provides computational
libraries and zero-copy streaming messaging and interprocess
communication.

This package provides utilities for working with the Parquet format.
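
As a rough illustration of how the utilities shipped in this package might be driven from a script, the sketch below checks which of the three binaries from the file list are on PATH and wraps a call to parquet-dump-schema. The helper names (`installed_tools`, `build_command`, `dump_schema`) are hypothetical, and the sketch assumes each tool accepts the Parquet file path as a positional argument; no other options are assumed.

```python
import shutil
import subprocess

# Utilities listed in this package's file list.
PARQUET_TOOLS = ("parquet-dump-schema", "parquet-reader", "parquet-scan")


def installed_tools(tools=PARQUET_TOOLS):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in tools}


def build_command(tool, parquet_file):
    """Build an argv list (assumption: the file is passed as a positional argument)."""
    if tool not in PARQUET_TOOLS:
        raise ValueError(f"unknown tool: {tool}")
    return [tool, str(parquet_file)]


def dump_schema(parquet_file):
    """Run parquet-dump-schema on a file and return its standard output as text."""
    result = subprocess.run(
        build_command("parquet-dump-schema", parquet_file),
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

Calling `installed_tools()` first is a cheap way to report a missing package before any file is touched; `dump_schema` raises `subprocess.CalledProcessError` if the tool exits with a non-zero status.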

Provides

Requires

License

Apache-2.0 AND BSD-3-Clause AND BSD-2-Clause AND MIT

Changelog

* Sat Mar 23 2024 Ben Greiner <code@bnavigator.de>
  - Update to 15.0.2
    ## Bug Fixes
    * [C++][Acero] Increase size of Acero TempStack (#40007)
    * [C++][Dataset] Add missing Protobuf static link dependency
      (#40015)
    * [C++] Possible data race when reading metadata of a parquet
      file (#40111)
    * [C++] Make span SFINAE standards-conforming to enable
      compilation with nvcc (#40253)
* Wed Feb 28 2024 Ben Greiner <code@bnavigator.de>
  - Reenable logging
    * Add apache-arrow-pr40230-glog-0.7.patch
    * Add apache-arrow-pr40275-glog-0.7-2.patch
    * now requires glog devel files to be present for
      apache-arrow-devel; ArrowConfig.cmake fails otherwise
    * gh#apache/arrow#40181
    * gh#apache/arrow#40230
    * gh#apache/arrow#40275
* Fri Feb 23 2024 Ben Greiner <code@bnavigator.de>
  - Update to 15.0.1
    ## Bug Fixes
    * [C++] "iso_calendar" kernel returns incorrect results for array
      length > 32 (#39360)
    * [C++] Explicit error in ExecBatchBuilder when appending var
      length data exceeds offset limit (int32 max) (#39383)
    * [C++][Parquet] Pass memory pool to decoders (#39526)
    * [C++][Parquet] Validate page sizes before truncating to int32
      (#39528)
    * [C++] Fix tail-word access cross buffer boundary in
      `CompareBinaryColumnToRow` (#39606)
    * [C++] Fix the issue of ExecBatchBuilder when appending
      consecutive tail rows with the same id may exceed buffer
      boundary (for fixed size types) (#39585)
    * [Release] Update platform tags for macOS wheels to macosx_10_15
      (#39657)
    * [C++][FlightRPC] Fix nullptr dereference in PollInfo (#39711)
    * [C++] Fix tail-byte access cross buffer boundary in key hash
      avx2 (#39800)
    * [C++][Acero] Fix AsOfJoin with differently ordered schemas than
      the output (#39804)
    * [C++] Expression ExecuteScalarExpression execute empty args
      function with a wrong result (#39908)
    * [C++] Strip extension metadata when importing a registered
      extension (#39866)
    * [C#] Restore support for .NET 4.6.2 (#40008)
    * [C++] Fix out-of-line data size calculation in
      BinaryViewBuilder::AppendArraySlice (#39994)
    * [C++][CI][Parquet] Fixing parquet column_writer_test building
      (#40175)
    ## New Features and Improvements
    * [C++] PollFlightInfo does not follow rule of 5
    * [C++] Fix filter and take kernel for month_day_nano intervals
      (#39795)
    * [C++] Thirdparty: Bump zlib to 1.3.1 (#39877)
    * [C++] Add missing "#include <algorithm>" (#40010)
  - Release 15.0.0
    ## Bug Fixes
    * [C++] Bring back case_when tests for union types (#39308)
    * [C++] Fix the issue of ExecBatchBuilder when appending
      consecutive tail rows with the same id may exceed buffer
      boundary (#39234)
    * [C++][Python] Add a no-op kernel for
      dictionary_encode(dictionary) (#38349)
    * [C++] Use the latest tagged version of flatbuffers (#38192)
    * [C++] Don't use MSVC_VERSION to determine
      -fms-compatibility-version (#36595)
    * [C++] Optimize hash kernels for Dictionary ChunkedArrays
      (#38394)
    * [C++][Gandiva] Avoid registering exported functions multiple
      times in gandiva (#37752)
    * [C++][Acero] Fix race condition caused by straggling input in
      the as-of-join node (#37839)
    * [C++][Parquet] add more closed file checks for
      ParquetFileWriter (#38390)
    * [C++][FlightRPC] Add missing app_metadata arguments (#38231)
    * [C++][Parquet] Fix Valgrind memory leak in
      arrow-dataset-file-parquet-encryption-test (#38306)
    * [C++][Parquet] Don't initialize OpenSSL explicitly with OpenSSL
      1.1 (#38379)
    * [C++] Re-generate flatbuffers C++ for Skyhook (#38405)
    * [C++] Avoid passing null pointer to LZ4 frame decompressor
      (#39125)
    * [C++] Add missing explicit size_t cast for i386 (#38557)
    * [C++] Fix: add TestingEqualOptions for gtest functions.
      (#38642)
    * [C++][Gandiva] Use arrow io util to replace
      std::filesystem::path in gandiva (#38698)
    * [C++] Protect against PREALLOCATE preprocessor defined on macOS
      (#38760)
    * [C++] Check variadic buffer counts in bounds (#38740)
    * [C++][FS][Azure] Do nothing for CreateDir("/container", true)
      (#38783)
    * Fix TestArrowReaderAdHoc.ReadFloat16Files to use new
      uncompressed files (#38825)
    * [C++] S3FileSystem export s3 sdk config
      "use_virtual_addressing" to arrow::fs::S3Options (#38858)
    * [C++][Gandiva] Fix Gandiva to_date function's validation for
      suppress errors parameter (#38987)
    * [C++][Parquet] Fix spelling (#38959)
    * [C++] Fix spelling (acero) (#38961)
    * [C++] Fix spelling (compute) (#38965)
    * [C++] Fix spelling (util) (#38967)
    * [C++] Fix spelling (dataset) (#38969)
    * [C++] Fix spelling (filesystem) (#38972)
    * [C++] Fix spelling (#38978)
    * [C++] Fix spelling (#38980)
    * [C++][Acero] union node output batches should be unordered
      (#39046)
    * [C++][CI] Fix Valgrind failures (#39127)
    * [C++] Remove needless system Protobuf dependency with
      -DARROW_HDFS=ON (#39137)
    * [C++][Compute] Fix negative duration division (#39158)
    * [C++] Add missing data copy in StreamDecoder::Consume(data)
      (#39164)
    * [C++] Remove compiler warnings with -Wconversion
      -Wno-sign-conversion in public headers (#39186)
    * [C++][Benchmarking] Remove hardcoded min times (#39307)
    * [C++] Don't use "if constexpr" in lambda (#39334)
    * [C++] Disable -Werror=attributes for Azure SDK's identity.hpp
      (#39448)
    * [C++] Fix compile warning (#39389)
    * [CI][JS] Force node 20 on JS build on arm64 to fix build issues
      (#39499)
    * [C++] Disable parallelism for jemalloc external project
      (#39522)
    * [C++][Parquet] Fix crash in test_parquet_dataset_lazy_filtering
      (#39632)
    * [C++] Disable parallelism for all `make`-based externalProjects
      when CMake >= 3.28 is used
    ## New Features and Improvements
    * [C++][JSON] Change the max rows to Unlimited(int_32) (#38582)
    * [C++][Python] Add "Z" to the end of timestamp print string when
      tz defined (#39272)
    * [C++][Python] DLPack implementation for Arrow Arrays (producer)
      (#38472)
    * [C++] Diffing of Run-End Encoded arrays (#35003)
    * [C++][Python][R] Allow users to adjust S3 log level by
      environment variable (#38267)
    * [C++][Format] Implementation of the LIST_VIEW and
      LARGE_LIST_VIEW array formats (#35345)
    * [C++] Use Cast() instead of CastTo() for Scalar in test
      (#39044)
    * [C++][Python][Parquet] Implement Float16 logical type (#36073)
    * [C++] Add Utf8View and BinaryView to the c ABI (#38443)
    * [C++][Parquet] Add api to get RecordReader from RowGroupReader
      (#37003)
    * [C++] Expose a span converter for Buffer and ArraySpan (#38027)
    * [C++] Add A Dictionary Compaction Function For DictionaryArray
      (#37418)
    * [C++] Add arrow::ipc::StreamDecoder::Reset() (#37970)
    * [C++] Implement file reads for Azure filesystem (#38269)
    * [C++][Integration] Add C++ Utf8View implementation (#37792)
    * [C++][Gandiva] Add external function registry support (#38116)
    * [C++][Gandiva] Migrate LLVM JIT engine from MCJIT to ORC
      v2/LLJIT (#39098)
    * [C++] Feature: support concatenate recordbatches. (#37896)
    * [C++] Add support for specifying custom Array opening and
      closing delimiters to arrow::PrettyPrintDelimiters (#38187)
    * [R] Allow code() to return package name prefix. (#38144)
    * [C++][Benchmark] Add non-stream Codec Compression/Decompression
      (#38067)
    * [C++][Parquet] Change DictEncoder dtor checking to warning log
      (#38118)
    * [C++][Parquet] Support reading parquet files with multiple gzip
      members (#38272)
    * [C++][Parquet] check the decompressed page size same as size in
      page header (#38327)
    * [C++][Azure] Use properties for input stream metadata (#38524)
    * [C++][FS][Azure] Implement file writes (#38780)
    * [C++] Implement GetFileInfo for a single file in Azure
      filesystem (#38505)
    * [C++][CMake] Use transitive dependency for system GoogleTest
      (#38340)
    * [C++][Parquet] Use new encrypted files for page index
      encryption test (#38347)
    * Add validation logic for offsets and values to
      arrow.array.ListArray.fromArrays (#38531)
    * [C++][Acero] Create a sorted merge node (#38380)
    * [C++][Benchmark] Adding benchmark for LZ4/Snappy Compression
      (#38453)
    * [C++] Support LogicalNullCount for DictionaryArray (#38681)
    * [C++][Parquet] Faster scalar BYTE_STREAM_SPLIT (#38529)
    * [C++][Gandiva] Support registering external C functions
      (#38632)
    * [C++] Implement GetFileInfo(selector) for Azure filesystem
      (#39009)
    * [C++][FS][Azure] Implement CreateDir() (#38708)
    * [C++][FS][Azure] Implement DeleteDir() (#38793)
    * [C++][FS][Azure] Implement DeleteDirContents() (#38888)
    * [C++] Implement AzureFileSystem::DeleteRootDirContents
      (#39151)
    * [C++][FS][Azure] Implement CopyFile() (#39058)
    * [C++][Go][Parquet] Add tests for reading Float16 files in
      parquet-testing (#38753)
    * [C++][FS][Azure] Rename AzurePath to AzureLocation (#38773)
    * [C++] Implement directory semantics even when the storage
      account doesn't support HNS (#39361)
    * [C++][Parquet] Update parquet.thrift to sync with 2.10.0
      (#38815)
    * [C++] Replace "#ifdef ARROW_WITH_GZIP" in dataset test to
      ARROW_WITH_ZLIB (#38853)
    * [C++][Parquet] Using length to optimize bloom filter read
      (#38863)
    * [C++][Parquet] Minor: making parquet TypedComparator operation
      as const method (#38875)
    * [C++] DatasetWriter release rows_in_flight_throttle when
      allocate writing failed (#38885)
    * [C++][Parquet] Move EstimatedBufferedValueBytes from
      TypedColumnWriter to ColumnWriter (#39055)
    * [C++] Stop installing internal bpacking_simd* headers (#38908)
    * [C++][Gandiva] Refactor function holder to return arrow Result
      (#38873)
    * [C++] Use Cast() instead of CastTo() for Dictionary Scalar in
      test (#39362)
    * [C++] Use Cast() instead of CastTo() for Timestamp Scalar in
      test (#39060)
    * [C++] Use Cast() instead of CastTo() for List Scalar in test
      (#39353)
    * [C++][Parquet] Support row group filtering for nested paths for
      struct fields (#39065)
    * [C++] Refactor the Azure FS tests and filesystem class
      instantiation (#39207)
    * [C++][Parquet] Optimize FLBA record reader (#39124)
    * Create module info compiler plugin (#39135)
    * [C++] Try to make Buffer::device_type_ non-optional (#39150)
    * [C++][Parquet] Remove deprecated AppendRowGroup(int64_t
      num_rows) (#39209)
    * [C++][Parquet] Avoid WriteRecordBatch from produce zero-sized
      RowGroup (#39211)
    * [C++] Support binary to fixed_size_binary cast (#39236)
    * [C++][Azure][FS] Add default credential auth configuration
      (#39263)
    * [C++] Don't install bundled Azure SDK for C++ with CMake 3.28+
      (#39269)
    * [C++][FS] Remove the AzureBackend enum and add more flexible
      connection options (#39293)
    * [C++][FS] Inform caller of container not-existing when
      checking for HNS support (#39298)
    * [C++][FS][Azure] Add workload identity auth configuration
      (#39319)
    * [C++][FS][Azure] Add managed identity auth configuration
      (#39321)
    * [C++] Forward arguments to ExceptionToStatus all the way to
      Status::FromArgs (#39323)
    * [C++] Flaky DatasetWriterTestFixture.MaxRowsOneWriteBackpresure
      test (#39379)
    * [C++] Add ForceCachedHierarchicalNamespaceSupport to help with
      testing (#39340)
    * [C++][FS][Azure] Add client secret auth configuration (#39346)
    * [C++] Reduce function.h includes (#39312)
    * [C++] Use Cast() instead of CastTo() for Parquet (#39364)
    * [C++][Parquet] Vectorize decode plain on FLBA (#39414)
    * [C++][Parquet] Style: Using arrow::Buffer data_as api rather
      than reinterpret_cast (#39420)
    * [C++][ORC] Upgrade ORC to 1.9.2 (#39431)
    * [C++] Use default Azure credentials implicitly and support
      anonymous credentials explicitly (#39450)
    * [C++][Parquet] Allow reading dictionary without reading data
      via ByteArrayDictionaryRecordReader (#39153)
  - Disable logging until compatibility with glog is restored
    gh#apache/arrow#40181
* Mon Jan 15 2024 Ben Greiner <code@bnavigator.de>
  - Update to 14.0.2
    ## New Features and Improvements
    * GH-38449 - [Release][Go][macOS] Use local test data if possible
      (#38450)
    * GH-38591 - [Parquet][C++] Remove redundant open calls in
      ParquetFileFormat::GetReaderAsync (#38621)
    ## Bug Fixes
    * GH-38345 - [Release] Use local test data for verification if
      possible (#38362)
    * GH-38438 - [C++] Dataset: Trying to fix the async bug in
      Parquet dataset (#38466)
    * GH-38577 - Reading parquet file behavior change from 13.0.0 to
      14.0.0
    * GH-38618 - [C++] S3FileSystem: fix regression in deleting
      explicitly created sub-directories (#38845)
    * GH-38861 - [C++] Add missing “-framework Security” to
      Libs.private in arrow.pc (#38869)
    * GH-39072 - [Release][CI] Python3.11-devel is required for the
      verification job on AlmaLinux 8 (#39073)
    * GH-39074 - [Release][Packaging] Use UTF-8 explicitly for KEYS
      (#39082)
* Thu Jan 11 2024 pgajdos@suse.com
  - disable some tests for s390x [bsc#1218592]
* Mon Nov 13 2023 Ondřej Súkup <mimi.vx@gmail.com>
  - update to 14.0.1
    * GH-38431 - [Python][CI] Update fs.type_name checks for s3fs tests
    * GH-38607 - [Python] Disable PyExtensionType autoload
  - update to 14.0.0
    * very long list of changes can be found here:
      https://arrow.apache.org/release/14.0.0.html
* Fri Aug 25 2023 Ben Greiner <code@bnavigator.de>
  - Update to 13.0.0
    ## Acero
    * Handling of unaligned buffers in input nodes can be configured
      programmatically or by setting the environment variable
      ACERO_ALIGNMENT_HANDLING. The default behavior is to warn when
      an unaligned buffer is detected GH-35498.
    ## Compute
    * Several new functions have been added:
      - aggregate functions “first”, “last”, “first_last” GH-34911;
      - vector functions “cumulative_prod”, “cumulative_min”,
        “cumulative_max” GH-32190;
      - vector function “pairwise_diff” GH-35786.
    * Sorting now works on dictionary arrays, with a much better
      performance than the naive approach of sorting the decoded
      dictionary GH-29887. Sorting also works on struct arrays, and
      nested sort keys are supported using FieldRef GH-33206.
    * The check_overflow option has been removed from
      CumulativeSumOptions as it was redundant with the availability
      of two different functions: “cumulative_sum” and
      “cumulative_sum_checked” GH-35789.
    * Run-end encoded filters are efficiently supported GH-35749.
    * Duration types are supported with the “is_in” and “index_in”
      functions GH-36047. They can be multiplied with all integer
      types GH-36128.
    * “is_in” and “index_in” now cast their inputs more flexibly:
      they first attempt to cast the value set to the input type,
      then in the other direction if the former fails GH-36203.
    * Multiple bugs have been fixed in “utf8_slice_codeunits” when
      the stop option is omitted GH-36311.
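
To make the new vector functions named in the 13.0.0 Compute notes concrete, here is a minimal pure-Python sketch of what “cumulative_prod”, “cumulative_min”, “cumulative_max” and “pairwise_diff” compute on a plain list. It is not the Arrow implementation: null handling is left out, and the `period` parameter and the use of `None` for the leading slots of `pairwise_diff` are assumptions of this sketch.

```python
from itertools import accumulate
import operator


def cumulative_prod(values):
    """Running product: out[i] = values[0] * ... * values[i]."""
    return list(accumulate(values, operator.mul))


def cumulative_min(values):
    """Running minimum: out[i] = min(values[0..i])."""
    return list(accumulate(values, min))


def cumulative_max(values):
    """Running maximum: out[i] = max(values[0..i])."""
    return list(accumulate(values, max))


def pairwise_diff(values, period=1):
    """out[i] = values[i] - values[i - period]; the first `period` slots have no partner."""
    if period < 1:
        raise ValueError("this sketch only handles period >= 1")
    return [
        None if i < period else values[i] - values[i - period]
        for i in range(len(values))
    ]


data = [3, 1, 4, 1, 5]
print(cumulative_prod(data))  # [3, 3, 12, 12, 60]
print(cumulative_min(data))   # [3, 1, 1, 1, 1]
print(cumulative_max(data))   # [3, 3, 4, 4, 5]
print(pairwise_diff(data))    # [None, -2, 3, -3, 4]
```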
    ## Dataset
    * A custom schema can now be passed when writing a dataset
      GH-35730. The custom schema can alter nullability or metadata
      information, but is not allowed to change the datatypes
      written.
    ## Filesystems
    * The S3 filesystem now writes files in equal-sized chunks, for
      compatibility with Cloudflare’s “R2” Storage GH-34363.
    * A long-standing issue where S3 support could crash at shutdown
      because of resources still being alive after S3 finalization
      has been fixed GH-36346. Now, attempts to use S3 resources
      (such as making filesystem calls) after S3 finalization should
      result in a clean error.
    * The GCS filesystem accepts a new option to set the project id
      GH-36227.
    ## IPC
    * Nullability and metadata information for sub-fields of map
      types is now preserved when deserializing Arrow IPC GH-35297.
    ## Orc
    * The Orc adapter now maps Arrow field metadata to Orc type
      attributes when writing, and vice-versa when reading GH-35304.
    ## Parquet
    * It is now possible to write additional metadata while a
      ParquetFileWriter is open GH-34888.
    * Writing a page index can be enabled selectively per-column
      GH-34949. In addition, page header statistics are not written
      anymore if the page index is enabled for the given column
      GH-34375, as the information would be redundant and less
      efficiently accessed.
    * Parquet writer properties allow specifying the sorting columns
      GH-35331. The user is responsible for ensuring that the data
      written to the file actually complies with the given sorting.
    * CRC computation has been implemented for v2 data pages
      GH-35171. It was already implemented for v1 data pages.
    * Writing compliant nested types is now enabled by default
      GH-29781. This should not have any negative implication.
    * Attempting to load a subset of an Arrow extension type is now
      forbidden GH-20385. Previously, if an extension type’s storage
      is nested (for example a “Point” extension type backed by a
      struct<x: float64, y: float64>), it was possible to load
      selectively some of the columns of the storage type.
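
The page CRC mentioned in the 13.0.0 Parquet notes can be pictured with the short sketch below. It assumes the checksum is the standard CRC-32 as provided by zlib; exactly which bytes of a page are covered is defined by the Parquet format and is not modelled here. The function names `page_crc` and `verify_page` are hypothetical.

```python
import zlib


def page_crc(page_bytes: bytes) -> int:
    """Return the CRC-32 of the given page bytes as an unsigned 32-bit integer."""
    return zlib.crc32(page_bytes) & 0xFFFFFFFF


def verify_page(page_bytes: bytes, stored_crc: int) -> bool:
    """Compare a freshly computed checksum with the one stored alongside the page."""
    return page_crc(page_bytes) == stored_crc


payload = b"example page payload"
stored = page_crc(payload)                       # what a writer would record
print(verify_page(payload, stored))              # True
print(verify_page(payload + b"\x00", stored))    # False: the bytes changed
```

A reader that finds a mismatch can reject the page instead of decoding corrupted data.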
    ## Substrait
    * Support for various functions has been added: “stddev”,
      “variance”, “first”, “last” (GH-35247, GH-35506).
    * Deserializing sorts is now supported GH-32763. However, some
      features, such as clustered sort direction or custom sort
      functions, are not implemented.
    ## Miscellaneous
    * FieldRef sports additional methods to get a flattened version
      of nested fields GH-14946. Compared to their non-flattened
      counterparts, the methods GetFlattened, GetAllFlattened,
      GetOneFlattened and GetOneOrNoneFlattened combine a child’s
      null bitmap with its ancestors’ null bitmaps such as to compute
      the field’s overall logical validity bitmap.
    * In other words, given the struct array [null, {'x': null},
      {'x': 5}], FieldRef("x")::Get might return [0, null, 5] while
      FieldRef("x")::GetFlattened will always return [null, null, 5].
    * Scalar::hash() has been fixed for sliced nested arrays
      GH-35360.
    * A new floating-point to decimal conversion algorithm exhibits
      much better precision GH-35576.
    * It is now possible to cast between scalars of different
      list-like types GH-36309.
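
The flattened FieldRef lookup described in the 13.0.0 Miscellaneous notes can be sketched in plain Python. The struct array is modelled as a parent validity list plus the child's values and validity list; this layout and the function names are assumptions made for illustration and are not the Arrow API.

```python
def get_child(child_values, child_valid):
    """Non-flattened view: only the child's own validity is applied."""
    return [v if ok else None for v, ok in zip(child_values, child_valid)]


def get_child_flattened(parent_valid, child_values, child_valid):
    """Flattened view: a slot is valid only if both parent and child are valid."""
    return [
        v if (p and c) else None
        for p, v, c in zip(parent_valid, child_values, child_valid)
    ]


# The struct array [null, {'x': null}, {'x': 5}] from the note above.
parent_valid = [False, True, True]
x_values = [0, 0, 5]            # slot 0 holds a placeholder under the null parent
x_valid = [True, False, True]   # the child bitmap does not know the parent is null

print(get_child(x_values, x_valid))                          # [0, None, 5]
print(get_child_flattened(parent_valid, x_values, x_valid))  # [None, None, 5]
```

The placeholder 0 in the first slot is why the non-flattened result "might" contain a value there: whatever sits under a null parent is unspecified until the parent's bitmap is combined in.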
* Mon Jun 12 2023 Ben Greiner <code@bnavigator.de>
  - Update to 12.0.1
    * [GH-35423] - [C++][Parquet] Parquet PageReader Force
      decompression buffer resize smaller (#35428)
    * [GH-35498] - [C++] Relax EnsureAlignment check in Acero from
      requiring 64-byte aligned buffers to requiring value-aligned
      buffers (#35565)
    * [GH-35519] - [C++][Parquet] Fixing exception handling in parquet
      FileSerializer (#35520)
    * [GH-35538] - [C++] Remove unnecessary status.h include from
      protobuf (#35673)
    * [GH-35730] - [C++] Add the ability to specify custom schema on a
      dataset write (#35860)
    * [GH-35850] - [C++] Don't disable optimization with
      RelWithDebInfo (#35856)
  - Drop cflags.patch -- fixed upstream
* Thu May 18 2023 Ben Greiner <code@bnavigator.de>
  - Update to 12.0.0
    * Run-End Encoded Arrays have been implemented and are accessible
      (GH-32104)
    * The FixedShapeTensor Logical value type has been implemented
      using ExtensionType (GH-15483, GH-34796)
    ## Compute
    * New kernel to convert timestamp with timezone to wall time
      (GH-33143)
    * Cast kernels are now built into libarrow by default (GH-34388)
    ## Acero
    * Acero has been moved out of libarrow into its own shared
      library, allowing for smaller builds of the core libarrow
      (GH-15280)
    * Exec nodes now can have a concept of “ordering” and will reject
      non-sensible plans (GH-34136)
    * New exec nodes: “pivot_longer” (GH-34266), “order_by”
      (GH-34248) and “fetch” (GH-34059)
    * Breaking Change: Reorder output fields of “group_by” node so
      that keys/segment keys come before aggregates (GH-33616)
    ## Substrait
    * Add support for the round function GH-33588
    * Add support for the cast expression element GH-31910
    * Added API reference documentation GH-34011
    * Added an extension relation to support segmented aggregation
      GH-34626
    * The output of the aggregate relation now conforms to the spec
      GH-34786
    ## Parquet
    * Added support for DeltaLengthByteArray encoding to the Parquet
      writer (GH-33024)
    * NaNs are correctly handled now for Parquet predicate push-downs
      (GH-18481)
    * Added support for reading Parquet page indexes (GH-33596) and
      writing page indexes (GH-34053)
    * Parquet writer can write columns in parallel now (GH-33655)
    * Fixed incorrect number of rows in Parquet V2 page headers
      (GH-34086)
    * Fixed incorrect Parquet page null_count when stats are disabled
      (GH-34326)
    * Added support for reading BloomFilters to the Parquet Reader
      (GH-34665)
    * Parquet File-writer can now add additional key-value metadata
      after it has been opened (GH-34888)
    * Breaking Change: The default row group size for the Arrow
      writer changed from 64Mi rows to 1Mi rows. GH-34280
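
The effect of the row group size change noted for 12.0.0 can be worked out with simple arithmetic. The sketch below assumes that "Mi" means 2^20 and that the writer fills every row group up to the maximum row count; real writers may split on other limits as well, and the function name is hypothetical.

```python
MI = 1024 * 1024               # 1 Mi = 2**20 = 1,048,576
OLD_DEFAULT_ROWS = 64 * MI     # 67,108,864 rows per row group
NEW_DEFAULT_ROWS = 1 * MI      # 1,048,576 rows per row group


def row_group_count(num_rows: int, max_rows_per_group: int) -> int:
    """Number of row groups needed if each holds at most max_rows_per_group rows."""
    if max_rows_per_group <= 0:
        raise ValueError("max_rows_per_group must be positive")
    return -(-num_rows // max_rows_per_group)  # integer ceiling division


rows = 100_000_000
print(row_group_count(rows, OLD_DEFAULT_ROWS))  # 2
print(row_group_count(rows, NEW_DEFAULT_ROWS))  # 96
```

Under these assumptions, a table of 100 million rows goes from 2 row groups to 96.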
    ## ORC
    * Added support for the union type in ORC writer (GH-34262)
    * Fixed ORC CHAR type mapping with Arrow (GH-34823)
    * Fixed timestamp type mapping between ORC and arrow (GH-34590)
    ## Datasets
    * Added support for reading JSON datasets (GH-33209)
    * Dataset writer now supports specifying a function callback to
      construct the file name in addition to the existing file name
      template (GH-34565)
    ## Filesystems
    * GcsFileSystem::OpenInputFile avoids unnecessary downloads
      (GH-34051)
    ## Other changes
    * Convenience Append(std::optional...) methods have been added to
      array builders
      ([GH-14863](https://github.com/apache/arrow/issues/14863))
    * A deprecated OpenTelemetry header was removed from the Flight
      library (GH-34417)
    * Fixed crash in “take” kernels on ExtensionArrays with an
      underlying dictionary type (GH-34619)
    * Fixed bug where the C-Data bridge did not preserve nullability
      of map values on import (GH-34983)
    * Added support for EqualOptions to RecordBatch::Equals
      (GH-34968)
    * zstd dependency upgraded to v1.5.5 (GH-34899)
    * Improved handling of “logical” nulls such as with union and
      RunEndEncoded arrays (GH-34361)
    * Fixed incorrect handling of uncompressed body buffers in IPC
      reader, added IpcWriteOptions::min_space_savings for optional
      compression optimizations (GH-15102)
* Mon Apr 03 2023 Andreas Schwab <schwab@suse.de>
  - cflags.patch: fix option order to compile with optimisation
  - Adjust constraints
* Wed Mar 29 2023 Ben Greiner <code@bnavigator.de>
  - Remove gflags-static. It was only needed due to a packaging error
    with gflags which is about to be fixed in Tumbleweed
  - Disable build of the jemalloc memory pool backend
    * It requires every consuming application to LD_PRELOAD
      libjemalloc.so.2, even when it is not set as the default memory
      pool, due to static TLS block allocation errors
    * Usage of the bundled jemalloc as a workaround is not desired
      (gh#apache/arrow#13739)
    * jemalloc does not seem to have a clear advantage over the
      system glibc allocator:
      https://ursalabs.org/blog/2021-r-benchmarks-part-1
    * This overrides the default behavior documented in
      https://arrow.apache.org/docs/cpp/memory.html#default-memory-pool
* Sun Mar 12 2023 Ben Greiner <code@bnavigator.de>
  - Update to v11.0.0
    * ARROW-4709 - [C++] Optimize for ordered JSON fields (#14100)
    * ARROW-11776 - [C++][Java] Support parquet write from ArrowReader
      to file (#14151)
    * ARROW-13938 - [C++] Date and datetime types should autocast from
      strings
    * ARROW-14161 - [C++][Docs] Improve Parquet C++ docs (#14018)
    * ARROW-14999 - [C++] Optional field name equality checks for map
      and list type (#14847)
    * ARROW-15538 - [C++] Expanding coverage of math functions from
      Substrait to Acero (#14434)
    * ARROW-15592 - [C++] Add support for custom output field names in
      a substrait::PlanRel (#14292)
    * ARROW-15732 - [C++] Do not use any CPU threads in execution plan
      when use_threads is false (#15104)
    * ARROW-16782 - [Format] Add REE definitions to FlatBuffers
      (#14176)
    * ARROW-17144 - [C++][Gandiva] Add sqrt function (#13656)
    * ARROW-17301 - [C++] Implement compute function "binary_slice"
      (#14550)
    * ARROW-17509 - [C++] Simplify async scheduler by removing the
      need to call End (#14524)
    * ARROW-17520 - [C++] Implement SubStrait SetRel (UnionAll)
      (#14186)
    * ARROW-17610 - [C++] Support additional source types in
      SourceNode (#14207)
    * ARROW-17613 - [C++] Add function execution API for a
      preconfigured kernel (#14043)
    * ARROW-17640 - [C++] Add File Handling Test cases for GlobFile
      handling in Substrait Read (#14132)
    * ARROW-17798 - [C++][Parquet] Add DELTA_BINARY_PACKED encoder to
      Parquet writer (#14191)
    * ARROW-17825 - [C++] Allow the possibility to write several
      tables in ORCFileWriter (#14219)
    * ARROW-17836 - [C++] Allow specifying alignment of buffers
      (#14225)
    * ARROW-17837 - [C++][Acero] Create ExecPlan-owned QueryContext
      that will store a plan's shared data structures (#14227)
    * ARROW-17859 - [C++] Use self-pipe in signal-receiving StopSource
      (#14250)
    * ARROW-17867 - [C++][FlightRPC] Expose bulk parameter binding in
      Flight SQL (#14266)
    * ARROW-17932 - [C++] Implement streaming RecordBatchReader for
      JSON (#14355)
    * ARROW-17960 - [C++][Python] Implement list_slice kernel (#14395)
    * ARROW-17966 - [C++] Adjust to new format for Substrait optional
      arguments (#14415)
    * ARROW-17975 - [C++] Create at-fork facility (#14594)
    * ARROW-17980 - [C++] As-of-Join Substrait extension (#14485)
    * ARROW-17989 - [C++][Python] Enable struct_field kernel to accept
      string field names (#14495)
    * ARROW-18008 - [Python][C++] Add use_threads to
      run_substrait_query
    * ARROW-18051 - [C++] Enable tests skipped by ARROW-16392 (#14425)
    * ARROW-18095 - [CI][C++][MinGW] All tests exited with 0xc0000139
    * ARROW-18113 - [C++] Add RandomAccessFile::ReadManyAsync (#14723)
    * ARROW-18135 - [C++] Avoid warnings that ExecBatch::length may be
      uninitialized (#14480)
    * ARROW-18144 - [C++] Improve JSONTypeError error message in
      testing (#14486)
    * ARROW-18184 - [C++] Improve JSON parser benchmarks (#14552)
    * ARROW-18206 - [C++][CI] Add a nightly build for C++20
      compilation (#14571)
    * ARROW-18235 - [C++][Gandiva] Fix the like function
      implementation for escape chars (#14579)
    * ARROW-18249 - [C++] Update vcpkg port to arrow 10.0.0
    * ARROW-18253 - [C++][Parquet] Add additional bounds safety checks
      (#14592)
    * ARROW-18259 - [C++][CMake] Add support for system Thrift CMake
      package (#14597)
    * ARROW-18280 - [C++][Python] Support slicing to end in list_slice
      kernel (#14749)
    * ARROW-18282 - [C++][Python] Support step >= 1 in list_slice
      kernel (#14696)
    * ARROW-18287 - [C++][CMake] Add support for Brotli/utf8proc
      provided by vcpkg (#14609)
    * ARROW-18342 - [C++] AsofJoinNode support for Boolean data field
      (#14658)
    * ARROW-18350 - [C++] Use std::to_chars instead of std::to_string
      (#14666)
    * ARROW-18367 - [C++] Enable the creation of named table relations
      (#14681)
    * ARROW-18373 - Fix component drop-down, add license text (#14688)
    * ARROW-18377 - MIGRATION: Automate component labels from issue
      form content (#15245)
    * ARROW-18395 - [C++] Move select-k implementation into separate
      module
    * ARROW-18402 - [C++] Expose DeclarationInfo (#14765)
    * ARROW-18406 - [C++] Can't build Arrow with Substrait on Ubuntu
      20.04 (#14735)
    * ARROW-18409 - [GLib][Plasma] Suppress deprecated warning in
      building plasma-glib (#14739)
    * ARROW-18413 - [C++][Parquet] Expose page index info from
      ColumnChunkMetaData (#14742)
    * ARROW-18419 - [C++] Update vendored fast_float (#14817)
    * ARROW-18420 - [C++][Parquet] Introduce ColumnIndex & OffsetIndex
      (#14803)
    * ARROW-18421 - [C++][ORC] Add accessor for stripe information in
      reader (#14806)
    * ARROW-18427 - [C++] Support negative tolerance in AsofJoinNode
      (#14934)
    * ARROW-18435 - [C++][Java] Update ORC to 1.8.1 (#14942)
    * GH-14869 - [C++] Add Cflags.private defining _STATIC to .pc.in.
      (#14900)
    * GH-14920 - [C++][CMake] Add missing -latomic to Arrow CMake
      package (#15251)
    * GH-14937 - [C++] Add rank kernel benchmarks (#14938)
    * GH-14951 - [C++][Parquet] Add benchmarks for DELTA_BINARY_PACKED
      encoding (#15140)
    * GH-15072 - [C++] Move the round functionality into a separate
      module (#15073)
    * GH-15074 - [Parquet][C++] change 16-bit page_ordinal to 32-bit
      (#15182)
    * GH-15096 - [C++] Substrait ProjectRel Emit Optimization (#15097)
    * GH-15100 - [C++][Parquet] Add benchmark for reading strings from
      Parquet (#15101)
    * GH-15151 - [C++] Adding RecordBatchReaderSource to solve an
      issue in R API (#15183)
    * GH-15185 - [C++][Parquet] Improve documentation for Parquet
      Reader column_indices (#15184)
    * GH-15199 - [C++][Substrait] Allow
      AGGREGATION_INVOCATION_UNSPECIFIED as valid invocation (#15198)
    * GH-15200 - [C++] Created benchmarks for round kernels. (#15201)
    * GH-15216 - [C++][Parquet] Parquet writer accepts RecordBatch
      (#15240)
    * GH-15226 - [C++] Add DurationType to hash kernels (#33685)
    * GH-15237 - [C++] Add ::arrow::Unreachable() using
      std::string_view (#15238)
    * GH-15239 - [C++][Parquet] Parquet writer writes decimal as
      int32/64 (#15244)
    * GH-15290 - [C++][Compute] Optimize IfElse kernel AAS/ASA case
      when the scalar is null (#15291)
    * GH-33607 - [C++] Support optional additional arguments for
      inline visit functions (#33608)
    * GH-33657 - [C++] arrow-dataset.pc doesn't depend on parquet.pc
      without ARROW_PARQUET=ON (#33665)
    * PARQUET-2179 - [C++][Parquet] Add a test for skipping repeated
      fields (#14366)
    * PARQUET-2188 - [parquet-cpp] Add SkipRecords API to RecordReader
      (#14142)
    * PARQUET-2204 - [parquet-cpp] TypedColumnReaderImpl::Skip should
      reuse scratch space (#14509)
    * PARQUET-2206 - [parquet-cpp] Microbenchmark for ColumnReader
      ReadBatch and Skip (#14523)
    * PARQUET-2209 - [parquet-cpp] Optimize skip for the case that
      number of values to skip equals page size (#14545)
    * PARQUET-2210 - [C++][Parquet] Skip pages based on header
      metadata using a callback (#14603)
    * PARQUET-2211 - [C++] Print ColumnMetaData.encoding_stats field
      (#14556)
  - Remove unused python3-arrow package declaration
    * Add options as recommended for python support
  - Provide test data for unittests
  - Don't use system jemalloc but bundle it in order to avoid
    static TLS errors in consuming packages like python-pyarrow
    * gh#apache/arrow#13739
* Sun Aug 28 2022 Stefan Brüns <stefan.bruens@rwth-aachen.de>
  - Revert ccache change, using ccache in a pristine buildroot
    just slows down OBS builds (use --ccache for local builds).
  - Remove unused gflags-static-devel dependency.
* Mon Aug 22 2022 John Vandenberg <jayvdb@gmail.com>
  - Speed up builds with ccache
* Sat Aug 06 2022 Stefan Brüns <stefan.bruens@rwth-aachen.de>
  - Update to v9.0.0
    No (current) changelog provided
  - Spec file cleanup:
    * Remove lots of duplicate, unused, or wrong build dependencies
    * Do not package outdated Readmes and Changelogs
  - Enable tests, disable ones requiring external test data

Files

/usr/bin/parquet-dump-schema
/usr/bin/parquet-reader
/usr/bin/parquet-scan
/usr/share/doc/packages/apache-parquet-utils
/usr/share/doc/packages/apache-parquet-utils/README.md
/usr/share/licenses/apache-parquet-utils
/usr/share/licenses/apache-parquet-utils/LICENSE.txt
/usr/share/licenses/apache-parquet-utils/NOTICE.txt
/usr/share/licenses/apache-parquet-utils/header


Generated by rpm2html 1.8.1

Fabrice Bellet, Sat Nov 16 01:18:36 2024