cloudera-labs / hive-sre

Create Report that identifies "locations" in table defs that don't match `fs.defaultFS`

dstreev opened this issue

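This can happen when users created tables in a non-HA HDFS environment and later enabled HA without running the metatool. The metastore then still holds table locations that point at the old NameNode address instead of the current `fs.defaultFS`.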

The report should ignore non-HDFS namespaces.
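As a rough illustration of that rule, here is a minimal Python sketch. The function name, host names and the `hdfs://nameservice1` value are made up for the example and are not part of hive-sre. It compares the URI authority rather than a string prefix, so a namespace such as `nameservice1` is not confused with `nameservice10`.

```python
from urllib.parse import urlparse


def is_stale_hdfs_location(location: str, default_fs: str) -> bool:
    """Return True when an hdfs:// location points outside default_fs.

    Hypothetical helper for illustration only.
    """
    loc = urlparse(location)
    if loc.scheme != "hdfs":
        # Non-HDFS namespaces (s3a://, abfs://, ...) are ignored.
        return False
    # Compare the authority (host[:port] or nameservice), not a string prefix.
    return loc.netloc != urlparse(default_fs).netloc


# Assumed sample data: one pre-HA location, one current, one non-HDFS.
locations = [
    "hdfs://nn1.example.com:8020/warehouse/a",
    "hdfs://nameservice1/warehouse/b",
    "s3a://bucket/warehouse/c",
]
stale = [loc for loc in locations
         if is_stale_hdfs_location(loc, "hdfs://nameservice1")]
print(stale)
```

Only the first location should be reported: the second matches the assumed `fs.defaultFS`, and the third is not on HDFS.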

Testing with the following query:

SELECT DB_NAME,
       TBL_NAME,
       LOCATION,
       SERDES.SLIB AS SERDE_LIB
FROM (
         SELECT DBS.DB_ID,
                DBS.NAME AS DB_NAME,
                TBL_NAME,
                LOCATION,
                SERDE_ID
         FROM (
                  SELECT TBL_ID,
                         DB_ID,
                         TBL_NAME,
                         NONDEFAULTLOCS.SD_ID,
                         LOCATION,
                         SERDE_ID
                  FROM (
                           SELECT SD_ID,
                                  LOCATION,
                                  SERDE_ID
                           FROM (
                                    select *
                                    from SDS SDS
                                    WHERE SDS.LOCATION LIKE 'hdfs://%'
                                ) hdfsLocs
                           WHERE hdfsLocs.LOCATION NOT LIKE '${DEFAULT_NAMESPACE}%') NONDEFAULTLOCS
                           INNER JOIN TBLS TBLS ON NONDEFAULTLOCS.SD_ID = TBLS.SD_ID) NONDEFAULTTBL
                  INNER JOIN DBS DBS ON NONDEFAULTTBL.DB_ID = DBS.DB_ID) NONDEFAULT
         INNER JOIN SERDES SERDES ON NONDEFAULT.SERDE_ID = SERDES.SERDE_ID;
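
For anyone who wants to try the logic without touching a real metastore, here is a hedged sketch that runs a flattened version of the same joins against an in-memory SQLite database. The tables are cut down to the columns the query uses, the sample rows and the `hdfs://nameservice1` value are invented, and SQLite stands in for the actual metastore RDBMS. The default-namespace filter uses `NOT LIKE`, since the report targets locations outside `fs.defaultFS`.

```python
import sqlite3

DEFAULT_NAMESPACE = "hdfs://nameservice1"  # hypothetical fs.defaultFS value

conn = sqlite3.connect(":memory:")
# Minimal stand-in for the metastore tables; only the columns used below.
conn.executescript("""
CREATE TABLE DBS    (DB_ID INTEGER, NAME TEXT);
CREATE TABLE SERDES (SERDE_ID INTEGER, SLIB TEXT);
CREATE TABLE SDS    (SD_ID INTEGER, LOCATION TEXT, SERDE_ID INTEGER);
CREATE TABLE TBLS   (TBL_ID INTEGER, DB_ID INTEGER, SD_ID INTEGER, TBL_NAME TEXT);
INSERT INTO DBS VALUES (1, 'sales');
INSERT INTO SERDES VALUES (10, 'org.apache.hadoop.hive.ql.io.orc.OrcSerde');
INSERT INTO SDS VALUES
  (100, 'hdfs://nn1.example.com:8020/warehouse/sales.db/old_tbl', 10),
  (101, 'hdfs://nameservice1/warehouse/sales.db/new_tbl', 10),
  (102, 's3a://bucket/sales/ext_tbl', 10);
INSERT INTO TBLS VALUES
  (1, 1, 100, 'old_tbl'), (2, 1, 101, 'new_tbl'), (3, 1, 102, 'ext_tbl');
""")

# Same joins as the nested query, written without the subselects.
REPORT_SQL = """
SELECT D.NAME AS DB_NAME, T.TBL_NAME, S.LOCATION, SER.SLIB AS SERDE_LIB
FROM SDS S
INNER JOIN TBLS T ON S.SD_ID = T.SD_ID
INNER JOIN DBS D ON T.DB_ID = D.DB_ID
INNER JOIN SERDES SER ON S.SERDE_ID = SER.SERDE_ID
WHERE S.LOCATION LIKE 'hdfs://%'          -- HDFS locations only
  AND S.LOCATION NOT LIKE ? || '%'        -- outside the default namespace
ORDER BY D.NAME, T.TBL_NAME
"""
rows = conn.execute(REPORT_SQL, (DEFAULT_NAMESPACE,)).fetchall()
for row in rows:
    print(row)
```

With the sample rows above, only `old_tbl` should come back: `new_tbl` already sits under the default namespace and `ext_tbl` is not on HDFS.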

Closing this issue because this can already be done from within Cloudera Manager. In CM, go to the Hive service and select "Update Hive Metastore NameNodes" from the "Actions" menu.