Command-Line Help (Options)

Flags	Argument	Description
-accept, --accept		Accept ALL confirmations and silence prompts
-ap, --acid-partition-count		Set the limit of partitions that the ACID strategy will work with. '-1' means no-limit.
-asm, --avro-schema-migration		Migrate AVRO Schema Files referenced in TBLPROPERTIES by 'avro.schema.url'. Without migration it is expected that the file will exist on the other cluster and match the 'url' defined in the schema DDL. If it's not present, schema creation will FAIL. Specifying this option REQUIRES the LEFT and RIGHT cluster to be LINKED. See docs: https://github.com/cloudera-labs/hms-mirror#linking-clusters-storage-layers
-at, --auto-tune		Auto-tune Session Settings for SELECTs and DISTRIBUTION for Partition INSERTs.
-c, --concurrency		Set application concurrency, default is 10
-cfg, --config		Config with details for the HMS-Mirror. Default: $HOME/.hms-mirror/cfg/default.yaml
-cine, --create-if-not-exist		CREATE table/partition statements will be adjusted to include 'IF NOT EXISTS'. This will ensure all remaining sql statements will be run. This can be used to sync partition definitions for existing tables.
-com, --comment		Comment to add to report output
-cs, --common-storage		Common Storage used with Data Strategy HYBRID, SQL, EXPORT_IMPORT. This will change the way these methods are implemented by using the specified storage location as a 'common' storage point between two clusters. In this case, the clusters do NOT need to be 'linked'. Each cluster DOES need to have access to the location and authorization to interact with the location. This may mean additional configuration requirements for 'hdfs' to ensure this seamless access.
-cto, --compress-text-output		Data movement (SQL/STORAGE_MIGRATION) of TEXT based file formats will be compressed in the new table.
-d, --data-strategy		Specify how the data will follow the schema. [DUMP, SCHEMA_ONLY, LINKED, SQL, EXPORT_IMPORT, HYBRID, CONVERT_LINKED, STORAGE_MIGRATION, COMMON, ICEBERG_CONVERSION]
-da, --downgrade-acid		Downgrade ACID tables to EXTERNAL tables with purge.
-db, --database		Comma separated list of Databases (up to 100).
-dbo, --database-only		Migrate the Database definitions as they exist from LEFT to RIGHT
-dbp, --db-prefix		Optional: A prefix to add to the RIGHT cluster DB Name. Usually used for testing.
-dbr, --db-rename		Optional: Rename target db to ... This option is only valid when '1' database is listed in `-db`.
-dbRegEx, --database-regex		RegEx of Database to include in process.
-dbsp, --database-skip-properties		Comma separated list of database properties (regex) to skip during the migration process. This will prevent the property from being set on the target cluster.
-dc, --distcp		Build the 'distcp' workplans. Optional argument (PULL, PUSH) to define which cluster is running the distcp commands. Default is PULL.
-dms, --data-movement-strategy		Specify how the data will follow the schema. [SQL, DISTCP, MANUAL]
-dp, --decrypt-password		Use this in conjunction with '-pkey' to decrypt the generated passcode from `-p`.
-ds, --dump-source		Specify which 'cluster' is the source for the DUMP strategy (LEFT|RIGHT).
-dtd, --dump-test-data		Used to dump a data set that can be fed into the process for testing.
-e, --execute		Execute the requested actions. Without this flag, the process is a dry run.
-ep, --export-partition-count		Set the limit of partitions that the EXPORT_IMPORT strategy will work with.
-epl, --evaluate-partition-location		For SCHEMA_ONLY and DUMP data-strategies, review the partition locations and build partition metadata calls to create them if they can't be located via 'MSCK'.
-ewd, --external-warehouse-directory		The external warehouse directory path. Should not include the namespace OR the database directory. This will be used to set the LOCATION database option.
-f, --flip		Flip the definitions for LEFT and RIGHT. Allows the same config to be used in reverse.
-fel, --force-external-location		Under some conditions, the LOCATION element for EXTERNAL tables is removed (e.g. -rdl). In that case we rely on the settings of the database definition to control the EXTERNAL table data location. But for some older Hive versions, the LOCATION element in the database is NOT honored. Even when the database LOCATION is set, the EXTERNAL table LOCATION defaults to the system wide warehouse settings. This flag will ensure the LOCATION element remains in the CREATE definition of the table to force its location.
-glm, --global-location-map	<key=value>	Comma separated key=value pairs of Locations to Map. For example: /myorig/data/finance=/data/ec/finance. This reviews 'EXTERNAL' table locations for the path '/myorig/data/finance' and replaces it with '/data/ec/finance'. Option can be used alone or with -rdl. Only applies to 'EXTERNAL' tables. If the table's location doesn't contain one of the supplied maps, it will be translated according to -rdl rules if -rdl is specified. If -rdl is not specified, the conversion for that table is skipped.
-h, --help		Help
-ip, --in-place		Downgrade ACID tables to EXTERNAL tables with purge.
-is, --intermediate-storage		Intermediate Storage used with Data Strategy HYBRID, SQL, EXPORT_IMPORT. This will change the way these methods are implemented by using the specified storage location as an intermediate transfer point between two clusters. In this case, the clusters do NOT need to be 'linked'. Each cluster DOES need to have access to the location and authorization to interact with the location. This may mean additional configuration requirements for 'hdfs' to ensure this seamless access.
-itpo, --iceberg-table-property-overrides	<key=value>	Comma separated key=value pairs of Iceberg Table Properties to set/override.
-iv, --iceberg-version		Specify the Iceberg Version to use. Specify 1 or 2. Default is 2.
-ltd, --load-test-data		Use the data saved by the `-dtd` option to test the process.
-ma, --migrate-acid	<bucket-threshold (2)>	Migrate ACID tables (if strategy allows). Optional: ArtificialBucketThreshold count that will remove the bucket definition if it's below this. Use this as a way to remove artificial bucket definitions that were added 'artificially' in legacy Hive. (default: 2)
-mao, --migrate-acid-only	<bucket-threshold (2)>	Migrate ACID tables ONLY (if strategy allows). Optional: ArtificialBucketThreshold count that will remove the bucket definition if it's below this. Use this as a way to remove artificial bucket definitions that were added 'artificially' in legacy Hive. (default: 2)
-mnn, --migrate-non-native		Migrate Non-Native tables (if strategy allows). These include table definitions that rely on external connections to systems like: HBase, Kafka, JDBC
-mnno, --migrate-non-native-only		Migrate Non-Native tables ONLY (if strategy allows). These include table definitions that rely on external connections to systems like: HBase, Kafka, JDBC
-np, --no-purge		For SCHEMA_ONLY, COMMON, and LINKED data strategies set RIGHT table to NOT purge on DROP
-o, --output-dir		Output Directory (default: $HOME/.hms-mirror/reports/<yyyy-MM-dd_HH-mm-ss>)
-p, --password		Use this in conjunction with '-pkey' to generate the encrypted password that you'll add to the configs for the JDBC connections.
-pkey, --password-key		The key used to encrypt / decrypt the cluster JDBC passwords. If not present, the passwords will be processed as is (clear text) from the config file.
-po, --property-overrides	<key=value>	Comma separated key=value pairs of Hive properties you wish to set/override.
-pol, --property-overrides-left	<key=value>	Comma separated key=value pairs of Hive properties you wish to set/override for LEFT cluster.
-por, --property-overrides-right	<key=value>	Comma separated key=value pairs of Hive properties you wish to set/override for RIGHT cluster.
-pt, --pass-through	<key=value>	Key=value property to pass-through to the configuration. This will allow you to set properties that are not part of the HMS-Mirror configuration or part of Spring's configuration.
-q, --quiet		Reduce screen reporting output. Good for background processes with output redirects to a file
-rdl, --reset-to-default-location		Strip 'LOCATION' from all target cluster definitions. This will allow the system defaults to take over and define the location of the new datasets.
-replay, --replay		Use to replay process from the report output.
-rid, --right-is-disconnected		Don't attempt to connect to the 'right' cluster and run in this mode
-ro, --read-only		For SCHEMA_ONLY, COMMON, and LINKED data strategies set RIGHT table to NOT purge on DROP. Intended for use with replication distcp strategies and has restrictions about existing DBs on RIGHT and PATH elements. To simply NOT set the purge flag for applicable tables, use -np.
-rr, --reset-right		Use this for testing to remove the database on the RIGHT using CASCADE.
-s, --sync		For SCHEMA_ONLY, COMMON, and LINKED data strategies. Drop and recreate schemas when different. Best used with `-ro` to ensure table/partition drops don't delete data. When used WITHOUT `-tf` it will compare all the tables in a database and sync (bi-directional). Meaning it will DROP tables on the RIGHT that aren't in the LEFT and ADD tables to the RIGHT that are missing. When used with `-ro`, table schemas can be updated by dropping and recreating. When used with `-tf`, only the tables that match the filter (on both sides) will be considered. When used with HYBRID, SQL, and EXPORT_IMPORT data strategies and ACID tables are involved, the tables will be dropped and recreated. The data in this case WILL be dropped and replaced.
-scw, --suppress-cli-warnings		Suppress CLI Warnings from the final on-screen report.
-sdpi, --sort-dynamic-partition-inserts		Used to set `hive.optimize.sort.dynamic.partition` in TEZ for optimal partition inserts. When not specified, will use prescriptive sorting by adding 'DISTRIBUTE BY' to transfer SQL. default: false
-sf, --skip-features		Skip Features evaluation.
-slc, --skip-link-check		Skip Link Check. Use when going between or to Cloud Storage to avoid having to configure hms-mirror with storage credentials and libraries. This does NOT preclude your Hive Server 2 and compute environment from such requirements.
-slt, --skip-legacy-translation		Skip Schema Upgrades and Serde Translations
-smn, --storage-migration-namespace		Optional: Used with the STORAGE_MIGRATION data strategy to specify the target namespace.
-sms, --storage-migration-strict		Use 'strict' location translations for storage migration.
-so, --skip-optimizations		Skip any optimizations during data movement, like dynamic sorting or distribute by
-sp, --sql-partition-count		Set the limit of partitions that the SQL strategy will work with. '-1' means no-limit.
-sql, --sql-output		This option is no longer required to get SQL out in a report. That is the default behavior.
-ssc, --skip-stats-collection		Skip collecting basic FS stats for a table. This WILL affect the optimizer and our ability to determine the best strategy for moving data.
-su, --setup		Setup a default configuration file through a series of questions
-swt, --save-working-tables		Save working tables (shadow tables) created during the migration process.
-tef, --table-exclude-filter		Filter tables (excludes) with name matching RegEx. Comparison done with 'show tables' results. Check case, that's important. Hive tables are generally stored in LOWERCASE. Make sure you double-quote the expression on the commandline.
-tf, --table-filter		Filter tables (inclusive) with name matching RegEx. Comparison done with 'show tables' results. Check case, that's important. Hive tables are generally stored in LOWERCASE. Make sure you double-quote the expression on the commandline.
-tfp, --table-filter-partition-count-limit		Filter OUT partitioned tables that have more partitions than specified here. Non-partitioned tables aren't filtered.
-tfs, --table-filter-size-limit		Filter tables OUT that are above the indicated size. Expressed in MB
-to, --transfer-ownership		If available (supported) on LEFT cluster, extract and transfer the table's owner to the RIGHT cluster. Note: This will make an 'extra' SQL call on the LEFT cluster to determine the ownership. This won't be supported on CDH 5 and some other legacy Hive platforms. Beware the cost of this extra call for EVERY table, as it may slow down the process for a large volume of tables.
-todb, --transfer-ownership-database		If available (supported) on LEFT cluster, extract and transfer the DB owner to the RIGHT cluster. Note: This will make an 'extra' SQL call on the LEFT cluster to determine the ownership. This won't be supported on CDH 5 and some other legacy Hive platforms.
-totbl, --transfer-ownership-table		If available (supported) on LEFT cluster, extract and transfer the table's owner to the RIGHT cluster. Note: This will make an 'extra' SQL call on the LEFT cluster to determine the ownership. This won't be supported on CDH 5 and some other legacy Hive platforms. Beware the cost of this extra call for EVERY table, as it may slow down the process for a large volume of tables.
-tt, --translation-type		Translation Strategy when migrating data. (ALIGNED|RELATIVE) Default is RELATIVE
-v, --views-only		Process VIEWs ONLY
-wd, --warehouse-directory		The warehouse directory path. Should not include the namespace OR the database directory. This will be used to set the MANAGEDLOCATION database option.
-wps, --warehouse-plans	<db=ext-dir:mngd-dir[,db=ext-dir:mngd-dir]...>	The warehouse plans by database. Defines a plan for a database with 'external' and 'managed' directories.
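
The `-glm` description above amounts to a path substitution over EXTERNAL table locations. The following is a minimal Python sketch of that behavior, not the HMS-Mirror implementation: the function names, the `hdfs://ns1` namespace, and the "longest source path wins" ordering are all assumptions made for illustration.

```python
def parse_location_map(arg):
    """Parse a 'key=value,key=value' string into a source -> target dict."""
    pairs = {}
    for item in arg.split(","):
        key, sep, value = item.strip().partition("=")
        if not sep or not key or not value:
            raise ValueError(f"expected key=value, got {item!r}")
        pairs[key] = value
    return pairs


def apply_location_map(location, mapping):
    """Return the translated location, or None when no mapping applies."""
    # Assumption: try longer source paths first so the most specific one wins.
    for source in sorted(mapping, key=len, reverse=True):
        if source in location:
            # Replace only the first occurrence of the source path.
            return location.replace(source, mapping[source], 1)
    return None


# Hypothetical namespace and table paths, using the mapping from the -glm row.
mapping = parse_location_map("/myorig/data/finance=/data/ec/finance")
print(apply_location_map("hdfs://ns1/myorig/data/finance/sales", mapping))
print(apply_location_map("hdfs://ns1/other/data/hr", mapping))
```

In this sketch a `None` result stands for the case where the location contains none of the supplied maps, which is where the table above says `-rdl` rules apply if `-rdl` is specified and the conversion is skipped otherwise.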

16 May 2025