Comparing changes

base repository: Unstructured-IO/unstructured-api
base: 0.0.70
...
head repository: Unstructured-IO/unstructured-api
compare: main
  • 11 commits
  • 17 files changed
  • 4 contributors

Commits on Jun 17, 2024

  1. build: replace rockylinux with chainguard/wolfi as a base image (#423)

    ### Summary
    Updates the Dockerfile to use the Chainguard wolfi-base image to reduce
    CVEs. Also adds a step in the docker publish job that scans the images
    and checks for CVEs before publishing.
    
    ### Testing
    Run `make docker-build` and `make docker-start-api`, then try:
    ```
    from unstructured.partition.api import partition_via_api
    
    # Placeholder: path to a local document to partition
    filename = "<PATH-TO-FILE>"
    
    elements = partition_via_api(
        filename=filename,
        api_url="http://localhost:8000/general/v0/general",
        api_key="<API-KEY>",
        strategy="hi_res",
    )
    
    print("\n\n".join([str(el) for el in elements]))
    ```
    christinestraub committed Jun 17, 2024
    2bdd52a

Commits on Jun 20, 2024

  1. fix: build and push workflow failing due to missing -f option `buildx build` command (#425)
    
    I noticed that images on the `main` branch are failing to build (and push)
    because the `-f` parameter is missing from `docker buildx build`. By default
    the command expects a file named `Dockerfile` to exist, but we only have
    `Dockerfile-amd64` and `Dockerfile-arm64`.
    
    
    ![image](https://github.com/Unstructured-IO/unstructured-api/assets/64484917/4527165a-909e-498d-b0ee-8bba4b1a13e4)
    
    ---------
    
    Co-authored-by: christinestraub
    micmarty-deepsense and christinestraub committed Jun 20, 2024
    e8c6fa9

Commits on Jun 21, 2024

  1. 80a6627
  2. fix: revert to rockylinux SHA that works (arm64) (#428)

    Reverts the unnecessary SHA update introduced in #427.
    micmarty-deepsense committed Jun 21, 2024
    d3564b6
  3. fix: re-add DOCKER_IMAGE env var in Test image step (#429)

    A shell syntax error occurs in the `docker-publish.yml` workflow.
    micmarty-deepsense committed Jun 21, 2024
    5b604b2
  4. fix: invalid env var setting in docker-publish workflow (#430)

    Fixes a bug introduced in the previous PR that caused the build to fail on `main`.
    micmarty-deepsense committed Jun 21, 2024
    2f482e8
  5. d7acffe

Commits on Jun 24, 2024

  1. build(deps): bump dependency versions (#434)

    ### Summary
    
    Bumps dependency versions for the API. Closes #432.
    MthwRobinson committed Jun 24, 2024
    d5a878f

Commits on Jun 28, 2024

  1. fix/Fix MS Office filetype errors and harden docker smoketest (#436)

    # Changes
    **Fix for docx and other office files returning `{"detail":"File type
    None is not supported."}`**
    After moving to the wolfi base image, the `mimetypes` lib no longer
    knows about these file extensions. To avoid issues like this, let's add
    an explicit mapping for all the file extensions we care about. I added a
    `filetypes.py` and moved `get_validated_mimetype` over. When this file
    is imported, we'll call `mimetypes.add_type` for all file extensions we
    support.
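
A minimal sketch of the explicit-mapping idea described above, assuming a small hypothetical subset of extensions. The real contents of `filetypes.py` are not shown in this commit message, so `EXPLICIT_MIMETYPES` and `register_mimetypes` are illustrative names only. `mimetypes.add_type` and `mimetypes.guess_type` are standard-library calls.

```python
import mimetypes

# Hypothetical subset of supported extensions (an assumption for illustration;
# the actual mapping in filetypes.py is not shown in the commit message).
EXPLICIT_MIMETYPES = {
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".md": "text/markdown",
}


def register_mimetypes(mapping=EXPLICIT_MIMETYPES):
    """Register each extension explicitly so lookups do not rely on the OS MIME database."""
    for extension, mimetype in mapping.items():
        # add_type replaces any existing entry for the same extension.
        mimetypes.add_type(mimetype, extension)


# Run at import time, mirroring the behavior described in the commit message.
register_mimetypes()

# guess_type returns a (type, encoding) tuple; only the type is of interest here.
print(mimetypes.guess_type("fake.docx")[0])
```

With a mapping like this registered at import time, the lookup for a supported extension should no longer depend on which MIME database the base image happens to ship.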
    
    **Update smoke test coverage**
    This bug snuck past because we were already providing the mimetype in
    the docker smoke test. I updated `test_happy_path` to test against the
    container with and without passing `content_type`. I added some missing
    filetypes, and sorted the test params by extension so we can see when
    new types are missing.
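
The coverage change above could be sketched as a parameter list in which every sample file appears twice, once with its MIME type and once without. The file names other than `fake.docx`, the `SAMPLES` mapping, and `smoke_test_cases` are assumptions for illustration; the actual `test_happy_path` parameters are not shown here.

```python
from pathlib import PurePosixPath

# Hypothetical sample files and MIME types (assumed for illustration only).
SAMPLES = {
    "notes.txt": "text/plain",
    "fake.docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "example.pdf": "application/pdf",
}


def smoke_test_cases(samples):
    """Build (filename, content_type) pairs, sorted by file extension.

    Each file yields two cases: one that passes the MIME type explicitly and
    one that passes None, so the server has to infer the type from the
    extension.
    """
    by_extension = sorted(samples, key=lambda name: PurePosixPath(name).suffix)
    return [
        (name, content_type)
        for name in by_extension
        for content_type in (samples[name], None)
    ]


for filename, content_type in smoke_test_cases(SAMPLES):
    print(filename, content_type)
```

Sorting by extension keeps the list easy to scan, which is the property the commit message relies on for spotting missing file types.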
    
    # Testing
    The new smoke test will verify that all filetypes are working. You can
    also `make docker-build && make docker-start-api`, and test out the docx
    in the sample docs dir. On `main`, this file will give you the error
    above.
    ```
    curl 'http://localhost:8000/general/v0/general' \
    --form 'files=@"fake.docx"'
    ```
    awalker4 committed Jun 28, 2024
    6710df0

Commits on Jul 9, 2024

  1. build(deps): bump to unstructured==0.14.10 (#438)

    ### Summary
    
    Bumps to `unstructured==0.14.10`.
    MthwRobinson committed Jul 9, 2024
    35d5b37

Commits on Jul 10, 2024

  1. build: move arm64 image to wolfi-base (#433)

    ### Summary
    
    Updates the `arm64` image to use `wolfi-base` instead of `rockylinux`
    and consolidates the `amd64` and `arm64` images into the same
    Dockerfile. As of this PR, the `amd64` and `arm64` images for the API
    are at parity.
    
    ### Testing
    
    Successful docker build on the feature branch can be seen in [this
    job](https://github.com/Unstructured-IO/unstructured-api/actions/runs/9875409234/job/27272072089).
    MthwRobinson committed Jul 10, 2024
    a2d5a5a