datahub-project / datahub

The Metadata Platform for your Data Stack

Home Page:https://datahubproject.io

Error on prefix fuzzy search of a key's value in customProperties when the value contains slashes

zhangzhaohuazai opened this issue

Describe the bug

I am using the list retrieval interface of DataHub 0.12.1.5 to run a fuzzy search on the value of a key (MYPATH) in customProperties. This is my dataset:
image

The query is 'customProperties:MYPATH=/opt/test/myfold'. When the value for the key contains a forward slash (/), no data is returned and the interface reports an error:
`

<title>Error 500 jakarta.servlet.ServletException: Request processing failed: com.datahub.util.exception.ESQueryException: Search query failed:</title>

HTTP ERROR 500 jakarta.servlet.ServletException: Request processing failed: com.datahub.util.exception.ESQueryException: Search query failed:

URI: /openapi/v2/entity/dataset
STATUS: 500
MESSAGE: jakarta.servlet.ServletException: Request processing failed: com.datahub.util.exception.ESQueryException: Search query failed:
SERVLET: openapiServlet
CAUSED BY: jakarta.servlet.ServletException: Request processing failed: com.datahub.util.exception.ESQueryException: Search query failed:
CAUSED BY: com.datahub.util.exception.ESQueryException: Search query failed:
CAUSED BY: OpenSearchStatusException[OpenSearch exception [type=search_phase_execution_exception, reason=all shards failed]]

Caused by:

jakarta.servlet.ServletException: Request processing failed: com.datahub.util.exception.ESQueryException: Search query failed:
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1022)
	at org.springframework.web.servlet.FrameworkServlet.doGet(FrameworkServlet.java:903)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:500)
	at org.springframework.web.servlet.FrameworkServlet.service(FrameworkServlet.java:885)
	at jakarta.servlet.http.HttpServlet.service(HttpServlet.java:587)
	at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:764)
	at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1665)
	at com.datahub.auth.authentication.filter.AuthenticationFilter.doFilter(AuthenticationFilter.java:106)
	at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:202)
	at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1635)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:527)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:131)
	at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:598)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:223)
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1580)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:221)
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1381)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:176)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:484)
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1553)
	at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:174)
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1303)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:129)
	at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:149)
	at org.eclipse.jetty.server.handler.HandlerList.handle(HandlerList.java:51)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:122)
	at org.eclipse.jetty.server.Server.handle(Server.java:563)
	at org.eclipse.jetty.server.HttpChannel$RequestDispatchable.dispatch(HttpChannel.java:1598)
	at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:753)
	at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:501)
	at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:287)
	at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:314)
	at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:100)
	at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:421)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:390)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:277)
	at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:199)
	at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:411)
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:969)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1194)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1149)
	at java.base/java.lang.Thread.run(Thread.java:840)
Caused by: com.datahub.util.exception.ESQueryException: Search query failed:
	at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.executeAndExtract(ESSearchDAO.java:204)
	at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.scroll(ESSearchDAO.java:422)
	at com.linkedin.metadata.search.elasticsearch.ElasticSearchService.structuredScroll(ElasticSearchService.java:291)
	at com.linkedin.metadata.search.client.CachingEntitySearchService.getRawScrollResults(CachingEntitySearchService.java:379)
	at com.linkedin.metadata.search.client.CachingEntitySearchService.getCachedScrollResults(CachingEntitySearchService.java:315)
	at com.linkedin.metadata.search.client.CachingEntitySearchService.scroll(CachingEntitySearchService.java:131)
	at com.linkedin.metadata.search.SearchService.scrollAcrossEntities(SearchService.java:260)
	at io.datahubproject.openapi.v2.delegates.EntityApiDelegateImpl.scroll(EntityApiDelegateImpl.java:477)
	at io.datahubproject.openapi.v2.generated.controller.DatasetApiController.scroll(DatasetApiController.java:353)
	at jdk.internal.reflect.GeneratedMethodAccessor1561.invoke(Unknown Source)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.springframework.web.method.support.InvocableHandlerMethod.doInvoke(InvocableHandlerMethod.java:262)
	at org.springframework.web.method.support.InvocableHandlerMethod.invokeForRequest(InvocableHandlerMethod.java:190)
	at org.springframework.web.servlet.mvc.method.annotation.ServletInvocableHandlerMethod.invokeAndHandle(ServletInvocableHandlerMethod.java:118)
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.invokeHandlerMethod(RequestMappingHandlerAdapter.java:917)
	at org.springframework.web.servlet.mvc.method.annotation.RequestMappingHandlerAdapter.handleInternal(RequestMappingHandlerAdapter.java:829)
	at org.springframework.web.servlet.mvc.method.AbstractHandlerMethodAdapter.handle(AbstractHandlerMethodAdapter.java:87)
	at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:1089)
	at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:979)
	at org.springframework.web.servlet.FrameworkServlet.processRequest(FrameworkServlet.java:1014)
	... 43 more
Caused by: OpenSearchStatusException[OpenSearch exception [type=search_phase_execution_exception, reason=all shards failed]]
	at org.opensearch.rest.BytesRestResponse.errorFromXContent(BytesRestResponse.java:209)
	at org.opensearch.client.RestHighLevelClient.parseEntity(RestHighLevelClient.java:2235)
	at org.opensearch.client.RestHighLevelClient.parseResponseException(RestHighLevelClient.java:2212)
	at org.opensearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1931)
	at org.opensearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1884)
	at org.opensearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1852)
	at org.opensearch.client.RestHighLevelClient.search(RestHighLevelClient.java:1095)
	at com.linkedin.metadata.search.elasticsearch.query.ESSearchDAO.executeAndExtract(ESSearchDAO.java:195)
	... 63 more
	Suppressed: org.opensearch.client.ResponseException: method [POST], host [http://elasticsearch:9200], URI [/datasetindex_v2/_search?typed_keys=true&max_concurrent_shard_requests=5&search_type=query_then_fetch&batched_reduce_size=512&ccs_minimize_roundtrips=false], status line [HTTP/1.1 400 Bad Request]
{"error":{"root_cause":[{"type":"query_shard_exception","reason":"Failed to parse query [customProperties:MYPATH=*/opt/test/myfold*]","index_uuid":"_d6DHEWDQZ-CsIcca9YHMQ","index":"datasetindex_v2"}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"datasetindex_v2","node":"hJNQsGN6RXa1ErSSo9CNug","reason":{"type":"query_shard_exception","reason":"Failed to parse query [customProperties:MYPATH=*/opt/test/myfold*]","index_uuid":"_d6DHEWDQZ-CsIcca9YHMQ","index":"datasetindex_v2","caused_by":{"type":"parse_exception","reason":"Cannot parse 'customProperties:MYPATH=*/opt/test/myfold*': Lexical error at line 1, column 43.  Encountered: <EOF> after : \"/myfold*\"","caused_by":{"type":"token_mgr_error","reason":"Lexical error at line 1, column 43.  Encountered: <EOF> after : \"/myfold*\""}}}}]},"status":400}
		at org.opensearch.client.RestClient.convertResponse(RestClient.java:375)
		at org.opensearch.client.RestClient.performRequest(RestClient.java:345)
		at org.opensearch.client.RestClient.performRequest(RestClient.java:320)
		at org.opensearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1918)
		... 67 more

Powered by Jetty:// 11.0.19
`

For the same query condition, the front end cannot retrieve the result:

![image](https://github.com/datahub-project/datahub/assets/28680957/1773c809-accc-46a8-a5fc-394652386d4b)
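
The `parse_exception` in the response is consistent with how Lucene's query-string syntax treats `/`: a forward slash opens a regular expression, so `/opt/` would be read as one regex and `/myfold*` as a second one that never closes, which matches `Encountered: <EOF> after : "/myfold*"`. One possible client-side workaround is to backslash-escape the reserved characters in the value before sending the query. This is a minimal sketch only: the helper name `escape_query_value` is hypothetical, the character set is assumed to match what Lucene's `QueryParser.escape` handles, and it has not been verified against DataHub 0.12.1.5.

```python
import re

# Characters with special meaning in Lucene query-string syntax.
# Assumption: this mirrors the set escaped by Lucene's QueryParser.escape.
_RESERVED = re.compile(r'([\\+\-!():^\[\]"{}~*?|&/])')


def escape_query_value(value: str) -> str:
    """Prefix every reserved character with a backslash (hypothetical helper)."""
    return _RESERVED.sub(r"\\\1", value)


# Escape only the value; the "customProperties:" field prefix must stay unescaped.
raw_path = "/opt/test/myfold"
query = f"customProperties:MYPATH={escape_query_value(raw_path)}"
print(query)
```

Whether the escaped form survives the server-side rewriting is untested here; the failed query in the log shows that wildcards were added around the value before it reached OpenSearch.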

To Reproduce
Steps to reproduce the behavior:

  1. Create a dataset with a customProperties value that contains a slash
  2. Call the ':8080/openapi/v2/entity/dataset' interface with a prefix-matching query on the value of the key: customProperties:MYPATH=/opt/test/myfold
  3. See the error
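
The request in step 2 can be sketched as follows. The snippet only builds the URL and sends nothing; the host `localhost`, the helper name `build_scroll_url`, and the `query` parameter name are assumptions that do not come from the report.

```python
from urllib.parse import urlencode


def build_scroll_url(base_url: str, query: str) -> str:
    """Build the dataset list URL with the search string percent-encoded."""
    # Assumption: the endpoint accepts the search string in a `query` parameter.
    params = urlencode({"query": query})
    return f"{base_url}/openapi/v2/entity/dataset?{params}"


url = build_scroll_url(
    "http://localhost:8080",  # hypothetical host; the report only gives port 8080
    "customProperties:MYPATH=/opt/test/myfold",
)
print(url)
```

Percent-encoding protects the slash only in transit. The server decodes the parameter before building the OpenSearch query, so this request would still be expected to fail as described.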

Expected behavior
The dataset can be retrieved.

Screenshots
image

Desktop (please complete the following information):

  • OS: DataHub is deployed on CentOS 8.5; I accessed it from Windows through DataHub's web interface and through Postman
  • Browser: Edge


This issue is stale because it has been open for 30 days with no activity. If you believe this is still an issue on the latest DataHub release please leave a comment with the version that you tested it with. If this is a question/discussion please head to https://slack.datahubproject.io. For feature requests please use https://feature-requests.datahubproject.io