Skip to content

Conversation

@zac-li
Copy link
Member

@zac-li zac-li commented Jan 15, 2024

Goals

Similar to #6046, we need to use jina to serve as a fastAPI for GCP.

Bunch of setup codes are copied, as Sagemaker and GCP requirements are mostly the same, except that GCP requires API request and response in certain format:

Request:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#request_requirements
Response:
https://cloud.google.com/vertex-ai/docs/predictions/custom-container-requirements#response_requirements

TODO

  • Resolve the docarray validation issue
  • Finish the unittests
@github-actions github-actions bot added size/M area/core This issue/PR affects the core codebase area/helper This issue/PR affects the helper functionality area/testing This issue/PR affects testing labels Jan 15, 2024
raise HTTPException(status_code=499, detail=status.description)
else:
return {"predictions": resp.docs}
return output_model(predictions=resp.docs)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the first return is not good

client.containers.prune()


class chdir(AbstractContextManager):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we need this? I do not like chdir behavior in tests



def test_provider_gcp_pod_inference():
with chdir(os.path.join(os.path.dirname(__file__), 'SampleExecutor')):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do not use chdir, pass the proper path

@codecov
Copy link

codecov bot commented Jan 16, 2024

Codecov Report

Attention: 108 lines in your changes are missing coverage. Please review.

Comparison is base (be263b1) 75.06% compared to head (b258808) 47.90%.
Report is 1 commits behind head on master.

Files Patch % Lines
jina/serve/runtimes/worker/http_gcp_app.py 0.00% 93 Missing ⚠️
jina/serve/runtimes/worker/request_handling.py 9.09% 10 Missing ⚠️
jina/serve/runtimes/servers/http.py 70.00% 3 Missing ⚠️
jina/serve/runtimes/asyncio.py 33.33% 2 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##           master    #6136       +/-   ##
===========================================
- Coverage   75.06%   47.90%   -27.17%     
===========================================
  Files         152      151        -1     
  Lines       14061    14133       +72     
===========================================
- Hits        10555     6770     -3785     
- Misses       3506     7363     +3857     
Flag Coverage Δ
jina 47.90% <10.74%> (-27.17%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

from jina.serve.runtimes.gateway.models import _to_camel_case

if not docarray_v2:
logger.warning('Only docarray v2 is supported with GCP. ')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would this raise an Exception at some point? it should

allow_methods=['*'],
allow_headers=['*'],
)
logger.warning('CORS is enabled. This service is accessible from any website!')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why warning?

return await process(input_model(data=[]))
else:
raise HTTPException(
status_code=400,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please use constants and not magic numbers.

@zac-li
Copy link
Member Author

zac-li commented Jan 17, 2024

hey @JoanFM , this is a WIP PR, so I will address your comments later. They are mostly copied from #6046, so I will keep them consistent for now.

if status.code == jina_pb2.StatusProto.ERROR:
raise HTTPException(status_code=499, detail=status.description)
else:
return VertexAIResponse(predictions=req.data.docs)
Copy link
Member Author

@zac-li zac-li Feb 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder why I got this error when running pytest -v -s -x test_gcp.py -k test_provider_gcp_deployment_inference, looks like it's failing on some docarray validation

  File "pydantic/main.py", line 449, in pydantic.main.BaseModel.dict
  File "pydantic/main.py", line 868, in _iter
  File "pydantic/main.py", line 794, in pydantic.main.BaseModel._get_value
  File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 135, in __init__
    super().__init__(docs)
  File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 163, in _validate_docs
    yield self._validate_one_doc(doc)
  File "/Users/zacli/Code/jina/env/lib/python3.9/site-packages/docarray/array/doc_list/doc_list.py", line 170, in _validate_one_doc
    raise ValueError(f'{doc} is not a {self.doc_type}')
ValueError: {'id': '5d814334ba65cd66b356f363262dd2ad', 'text': 'hello world', 'embeddings': NdArray([[0.72778929, 0.23900942, 0.03286786, 0.95298338, 0.77614308,
          0.03805365, 0.52810364, 0.70904448, 0.03221386, 0.40657516,
          0.05665328, 0.73416605, 0.04400287, 0.79719694, 0.42839442,
          0.43580189, 0.07334091, 0.60751732, 0.10342848, 0.51660762,
          0.06586024, 0.88129511, 0.42874586, 0.88633873, 0.31858046,
          0.71453457, 0.31093969, 0.92729799, 0.25257687, 0.92740828,
          0.61681301, 0.51788409, 0.00997884, 0.21210052, 0.91713272,
          0.37283237, 0.53357638, 0.77158797, 0.96699236, 0.13919022,
          0.92879329, 0.78211408, 0.65285461, 0.89399441, 0.42854324,
          0.94649221, 0.1966403 , 0.39643899, 0.42736609, 0.73116549,
          0.88923411, 0.9894739 , 0.9124282 , 0.83614148, 0.27662465,
          0.69740887, 0.34939623, 0.07094385, 0.38615401, 0.04148779,
          0.0133029 , 0.68877819, 0.60318039, 0.64945002]])} is not a <class 'executor.EmbeddingResponseModel'>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/core This issue/PR affects the core codebase area/helper This issue/PR affects the helper functionality area/testing This issue/PR affects testing size/M

4 participants