hms-mirror v2.3.1.x Help

Release Notes

Known Issues

The latest set of known issues can be found here.

Enhancement Requests

The latest set of enhancement requests can be found here.

If there is something you'd like to see, add a new issue here.

3.0.0.1

This release is based on the 2.3.1.5 release and includes all the features and bug fixes from that release.

This is a Security and CVE release that upgrades all dependencies to the latest possible versions to eliminate as many of the community CVEs as possible. This also required us to raise the minimum JDK version to 17.

What's New

  • JDK 17 Minimum Version Requirement. Addresses dependencies with CVE issues.

Bug Fixes

  • --sync not dropping table on right when left is missing.

2.3.1.5

Bug Fixes

  • Fixed Web UI session status preventing progress.

  • Handle NPE from SQL in-place downgrade of ACID tables.

  • Fixed locale issue with set statements that used numeric values.

  • Fixed floorDiv(long,int) to floorDiv(long,long) for Java8 compatibility.

What's New

  • Add support for 'in-place' removal of bucket definitions from an ACID table.

Note: This will be the last release in the 2.3.x branch with any feature enhancements. Future releases will be in the 3.x branch.

2.3.0.13

What's New

Enhance logging to show which instance is handling a connection/job in case of using multiple HS2 instances

2.3.0.12

Bug Fixes

The Hikari Connection Pool settings were causing intermittent connection failures, which in turn caused table transfer failures.

2.3.0.10

Bug Fixes

Second run in WebUI Fails

dbRegEx not being processed. Throws MISC_ERROR because it can't find any databases.

Legacy DBPROPERTIES are causing ERROR when attempting to set on CDP

Issue with jobs not completing when some schemas were already present.

Address lingering connections after run completes.

Fixed counters for CLI screen output.

Fixed an issue with tables being processed multiple times under some conditions.

Partition discovery for SHADOW table when source is a Managed table shouldn't try to build partitions with ALTER

LEFT side SQL when running 'execute' mode for SQL data strategy isn't being run.

CLI App version fails when attempting to set 'concurrency' option

MSCK for Shadow table not generated when 'metastore_direct' on the LEFT isn't defined

2.3.0.4

What's New

BETA Iceberg conversion support for the STORAGE_MIGRATION data strategy. See Iceberg Conversion for more details. To activate this beta feature for the WebUI, add --hms-mirror.config.beta=true to the startup command. For example: hms-mirror --service --hms-mirror.config.beta=true

Bugs (Fixed)

Add to connection init the ability to set the queue AND trigger an engine resource

Validate SQL elements before making changes to the Cluster

Extend the HS2 connection validation with Tez task validation

The reset-to-default-location doesn't seem to be working in v2

2.2.0.19.6 (pre-release)

What's New

BEHAVIOR CHANGE - Drop any shadow table definitions created during migration automatically

2.2.0.19.2 (pre-release)

What's New

  • Error event is not logged for table skipped because RIGHT already has the table

  • Would be great to be able to omit warnings from the end of logs

  • Be able to reduce spring framework related log entries

2.2.0.19.1

Bug Fixes

What's New

  • Validate JDBC Jar Files in config.

  • Ability to turn on strict mode for Storage Migration. This will cause distcp to fail when non-standard locations are used. To turn it on, use the -sms|--storage-migration-strict flag via the CLI.

Behavior Changes

The default behavior for Storage Migration 'strict' has changed from true to false. The intent behind the strict mode was to ensure distcp would fail when non-standard locations are used. The combination of metastore_direct and knowing the partition location details gives us a better chance of making these mappings work for distcp. When the scenario arises, we HIGHLY recommend that you validate the plans created. The new default behavior allows distcp to continue when non-standard locations are encountered, while raising a warning. This lets the migration continue, but you should validate the results.
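The difference between strict and non-strict handling can be sketched as follows. This is a minimal illustration, not the hms-mirror implementation: the function name, the exception class, and the definition of "non-standard" (a table location that does not sit under an expected base directory) are all assumptions made for the example.

```python
import warnings


class NonStandardLocationError(Exception):
    """Hypothetical error raised in strict mode for a non-standard location."""


def check_location(table_location: str, expected_base: str, strict: bool = False) -> bool:
    """Return True when table_location sits under expected_base.

    Assumption: "non-standard" means the location is outside expected_base.
    In strict mode such a location raises; otherwise it only warns.
    """
    base = expected_base.rstrip("/")
    is_standard = table_location == base or table_location.startswith(base + "/")
    if is_standard:
        return True
    message = f"Non-standard location: {table_location} is not under {base}"
    if strict:
        # Strict mode: stop so that no distcp plan is built from this mapping.
        raise NonStandardLocationError(message)
    # Default (non-strict) mode: continue, but surface a warning to review.
    warnings.warn(message)
    return False
```

With strict left at its default of False, a table outside the expected base yields a warning and a False return, which mirrors the advice above to validate the generated plans before relying on them.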

2.2.0.18.1

What's New

Beta Flag to be used for future beta features. To activate:

hms-mirror --beta
hms-mirror --service --hms-mirror.config.beta=true

Bugs (Fixed)

2.2.0.17.1

What's New

Bugs (Fixed)

2.2.0.15.1

What's New

Feature that allows you to 'skip' modifying the database location during a storage migration. This is useful if you're trying to archive tables in a database to another storage system, but want to leave the database location as is for new tables in the database. For STORAGE_MIGRATION, add option that would skip any Database Location Adjustments


2.2.0.15

What's New

Bugs (Fixed)

  • Output and Report Directory Consistency between CLI and Web UI. See docs for more details.

  • Postgres Metastore Direct Connection Fixes

  • SQL Data Strategy Validation Blockers for ACID tables.

2.2.0.12

What's New

Bugs (Fixed)

2.2.0.10

What's New

  • [Hive 4 DB OWNER DDL syntax for ALTERing DB OWNER requires 'USER'](https://github.com/cloudera-labs/hms-mirror/issues/139)

This change resulted in a simplification of how we determine what the cluster platform is. Previously we used two attributes (legacyHive and hdpHive3) to determine the platform. This information would direct logic around translations and other features.

Unfortunately, this isn't enough for us to determine all the scenarios we're encountering. These attributes have been replaced with a new attribute called platformType.

We will make automatic translations of legacy configurations to the new platformType attribute. The translation will be pretty basic and result in either the platform type being defined as HDP2 or CDP_7.1. If you have a more complex configuration, you'll need to adjust the platformType attribute manually. Future persisted configurations will use the new platformType attribute and drop the legacyHive and hdpHive3 attributes.
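As a rough sketch of what such a translation could look like, the function below rewrites a legacy cluster configuration held as a plain dictionary. The function name and the exact rule are assumptions for illustration: the text above only states that the result is either HDP2 or CDP_7.1, so mapping legacyHive=true to HDP2 and everything else to CDP_7.1 is a guess at the "pretty basic" rule, not the documented behavior.

```python
def translate_platform_type(cluster_config: dict) -> dict:
    """Return a copy of cluster_config that uses platformType only.

    Assumption (not confirmed by the release notes): legacyHive=true maps
    to HDP2 and any other combination maps to CDP_7.1.
    """
    updated = dict(cluster_config)
    legacy_hive = updated.pop("legacyHive", None)
    # hdpHive3 is dropped; this basic rule does not use it.
    updated.pop("hdpHive3", None)
    if "platformType" not in updated:
        updated["platformType"] = "HDP2" if legacy_hive else "CDP_7.1"
    return updated
```

An explicitly configured platformType is left untouched, which matches the advice to set the attribute manually for more complex environments.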

A feature that was late in making it into the Web UI is now here.

To ensure the right IP stack is used when the Web UI starts up, we're forcing this JDK configuration with the Web UI.

We had a few requests and issues with implementations where the target environment isn't always set up with the normal user 'home' standards that we can rely on. This change allows us to set the 'home' directory for the user running the application and ensure it's translated correctly in hms-mirror for storing and reading configurations, reports, and logs.

If you are in an environment that doesn't follow user $HOME standards, you can set the HOME environment variable to a custom directory BEFORE starting hms-mirror to alter the default behavior.
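The lookup order described above can be sketched like this. The helper name and the '.hms-mirror' subdirectory are assumptions used for illustration; check the directory layout documented for your version.

```python
import os
from pathlib import Path


def resolve_app_dir(environ=None) -> Path:
    """Resolve a base directory for configurations, reports, and logs.

    Assumption: files live in a '.hms-mirror' folder under the home
    directory. HOME from the environment wins; the OS default is the fallback.
    """
    env = os.environ if environ is None else environ
    home = env.get("HOME")
    base = Path(home) if home else Path.home()
    return base / ".hms-mirror"
```

Because the value is read when the function is called, HOME has to be exported before the application starts for the custom directory to take effect.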

Cleanup SQL has been added to Web Reporting UI

We've added a 'Cleanup SQL' tab to the Web Reporting UI. This will show you the SQL that was generated to clean up the source cluster after the migration. This is useful to see what will be done before you execute the migration.

Bugs (Fixed)

2.2.0.9

Bugs (Fixed)

Enhancements

Increase build dependencies to CDP 7.1.9 SP1. Rework Pass Key Management. Additional details in Connection Validation.

2.2.0.8

Bugs (Fixed)

2.2.0.7

Bugs (Fixed)

2.2.0.5

Bugs (Fixed)

2.2.0.4

Bugs (Fixed)

2.2.0.2

This is a big release for hms-mirror. We've added a Web interface to hms-mirror that makes it easier to configure and run various scenarios.

Along with the Web interface, we've made some significant adjustments to the hms-mirror engine which is much more complete than the previous release. The engine now supports a wider range of strategies and has a more robust configuration system.

We do our best to guide you through configurations that make sense, help you build plans and manage complex scenarios.

Automatic Configuration Adjustments

To ensure that configuration settings are properly set, the application will automatically adjust the configuration settings to match a valid scenario. These changes will be recorded in the 'run status config messages' section and can be seen on reports or the web interface.

Changes are mostly related to the acceptable strategy configurations. See Location Alignment for more details.

Property Overrides

Not yet available in Web UI. Coming soon issue 111.

This feature, introduced in the CLI, allows you to add/override Hive properties on the LEFT, RIGHT, or BOTH clusters for custom control of running Hive jobs. Most commonly used with SQL migration strategies.
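To make the LEFT, RIGHT, and BOTH scoping concrete, here is a small sketch that turns a set of overrides into Hive SET statements for one side. The dictionary layout and function name are hypothetical and do not reflect the hms-mirror configuration format; hive.exec.parallel and tez.queue.name are ordinary Hive and Tez properties used only as sample values.

```python
def build_set_statements(overrides: dict, side: str) -> list:
    """Return Hive SET statements for one side ('LEFT' or 'RIGHT').

    Overrides under 'BOTH' are emitted first, then side-specific ones, so a
    side-specific value for the same key is applied last in the session.
    """
    if side not in ("LEFT", "RIGHT"):
        raise ValueError("side must be 'LEFT' or 'RIGHT'")
    statements = []
    for scope in ("BOTH", side):
        for key, value in overrides.get(scope, {}).items():
            statements.append(f"SET {key}={value}")
    return statements


# Hypothetical overrides, keyed by the side they apply to.
sample_overrides = {
    "BOTH": {"hive.exec.parallel": "true"},
    "LEFT": {"tez.queue.name": "migration"},
    "RIGHT": {"tez.queue.name": "default"},
}
```

Calling build_set_statements(sample_overrides, "LEFT") yields the shared property followed by the LEFT-only queue setting.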

Evaluate Partition Locations and Reset to Default Location

These properties are no longer valid. An added property called 'translationType' is used to determine this functionality.

Previously, epl|evaluate-partition-locations would gather partition location information from the Metastore Direct connection to ensure the locations were aligned. We've simplified the concept with translationType, which defines either a RELATIVE or an ALIGNED strategy type.

See Location Alignment for more details.

Concurrency

In previous releases using the CLI, concurrency could be set through the configuration file's transfer:concurrency setting. The default was 4 if no setting was provided. This setting in the control file is NO LONGER supported and will be ignored. The new default concurrency setting is 10 and can be overridden only during application startup.

See Concurrency for more details.
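The precedence rules for concurrency can be summarized in a few lines. This is only a sketch of the behavior described in this section; the function and parameter names are hypothetical.

```python
DEFAULT_CONCURRENCY = 10  # new default stated in this release


def resolve_concurrency(startup_value=None, control_file_value=None) -> int:
    """Return the effective concurrency.

    control_file_value stands for the old transfer:concurrency setting. It is
    accepted here only to show that it no longer has any effect.
    """
    if startup_value is None:
        return DEFAULT_CONCURRENCY
    if startup_value < 1:
        raise ValueError("concurrency must be a positive integer")
    return startup_value
```

A control file that still carries a concurrency of 4 therefore runs with 10 unless a value is supplied at startup.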

Global Location Maps

Previous releases had a fairly basic implementation of 'Global Location Maps'. These could be supplied through the CLI option -glm, which is still supported but limited in functionality. The improved implementation works from the concept of building 'Warehouse Plans', which are then used to build the 'Global Location Maps'.

See Warehouse Plans for more details.

The -glm option can take an additional element to identify the mapping for a particular table type. As a result, any configuration files saved with this setting will not be loaded and will need to be updated.

While the -glm option will still honor the old format of source_dir=target_dir, the new format is source_dir:<table_type>:target_dir. The table_type is a new addition to the configuration and is required for the new implementation. When omitted, the mapping will be created for both EXTERNAL and MANAGED tables.

<table_type> can be one of: EXTERNAL_TABLE or MANAGED_TABLE.

-glm /tpcds_base_dir=EXTERNAL_TABLE:/alt/ext/location
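A parser for a single -glm value might look like the sketch below. It follows the shape of the example command (source, then '=', then an optional table type and ':' before the target) and is not taken from the hms-mirror source; the function name and the returned dictionary layout are assumptions.

```python
TABLE_TYPES = ("EXTERNAL_TABLE", "MANAGED_TABLE")


def parse_glm(value: str) -> dict:
    """Parse one -glm value into {source_dir: {table_type: target_dir}}.

    Accepts 'source_dir=target_dir' (old form) and
    'source_dir=TABLE_TYPE:target_dir' (form used in the example command).
    """
    source, separator, rest = value.partition("=")
    if not (separator and source and rest):
        raise ValueError(f"Expected source_dir=target_dir, got: {value!r}")
    for table_type in TABLE_TYPES:
        prefix = table_type + ":"
        # Match on the known type names only, so a ':' inside a target such
        # as hdfs://ns1/path is not mistaken for the type separator.
        if rest.startswith(prefix) and len(rest) > len(prefix):
            return {source: {table_type: rest[len(prefix):]}}
    # No table type given: the mapping applies to both table types.
    return {source: {table_type: rest for table_type in TABLE_TYPES}}
```

For instance, parse_glm("/data/base=/alt/location") produces an entry for both EXTERNAL_TABLE and MANAGED_TABLE, matching the "when omitted" rule above.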

Old Format

globalLocationMap:
  /tpcds_base_dir: "/alt/ext/location"
  /tpcds_base_dir2/web: "/alt/ext/location"

New Format

userGlobalLocationMap:
  /tpcds_base_dir:
    EXTERNAL_TABLE: "/alt/ext/location"
  /tpcds_base_dir2/web:
    EXTERNAL_TABLE: "/alt/ext/location"
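If you have several saved configurations in the old layout, a small helper can produce the nested structure for you to paste under userGlobalLocationMap. This is a sketch with a hypothetical function name. The old layout carries no table type, so the helper assumes each mapping should apply to both types; trim the result if you only want one.

```python
def upgrade_location_map(old_map: dict) -> dict:
    """Convert {source_dir: target_dir} into {source_dir: {table_type: target_dir}}.

    Assumption: an old-style entry applies to EXTERNAL_TABLE and MANAGED_TABLE.
    """
    table_types = ("EXTERNAL_TABLE", "MANAGED_TABLE")
    return {
        source: {table_type: target for table_type in table_types}
        for source, target in old_map.items()
    }


# Old-style entries, as they would appear after loading the YAML.
old_entries = {
    "/tpcds_base_dir": "/alt/ext/location",
    "/tpcds_base_dir2/web": "/alt/ext/location",
}
new_entries = upgrade_location_map(old_entries)
```

The resulting dictionary has one nested mapping per source directory, which is the shape shown under New Format.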

JDK 11 Support

The application now supports JDK 11, as well as JDK 8.

Kerberos Support and Platform Libraries

We are still working to replicate the options available in previous releases with regard to Kerberos connections. Currently, hms-mirror can only support a single Kerberos connection. This is the same as it was previously. hms-mirror packaging includes the core Hadoop classes required for Kerberos connections, pulled from the latest CDP release.

In the past, we 'could' support Kerberos connections to lower versions of Hadoop clusters (HDP and CDH) by running hms-mirror on a cluster with those Hadoop libraries installed and specifying --hadoop-classpath on the command line. This is no longer supported, as the packaging required to support the Web and REST interfaces is now different.

We are investigating the possibility of supporting kerberos connections to lower clusters in the future.

Metastore Direct Access

In later 1.6 releases we introduced a 'Metastore Direct' connection type when defining a LEFT(source) cluster. To help build a more complete picture of locations in the metadata, we found it necessary to gather detailed location information for each partition of the datasets being inspected. Because Hive was so configurable regarding location preferences and the ability to set locations at the partition level, we needed to ensure that the locations were aligned. The only sure way to get this complete picture was to connect directly to the Metastore backend database. We currently support 'MYSQL' and 'POSTGRES' metastore backends. 'Oracle' coming soon.
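To illustrate why a direct backend connection gives the complete picture, the sketch below runs the kind of join that returns one location per partition. It is not the query hms-mirror issues. The table and column names (DBS, TBLS, PARTITIONS, SDS) follow the standard Hive Metastore schema as an assumption you should verify against your metastore version, and an in-memory SQLite database with made-up rows stands in for the real MySQL or Postgres backend.

```python
import sqlite3

# Assumption: standard Hive Metastore table and column names.
# The '?' placeholder is SQLite style; MySQL and Postgres drivers use '%s'.
PARTITION_LOCATION_SQL = """
SELECT d.NAME, t.TBL_NAME, p.PART_NAME, s.LOCATION
FROM DBS d
JOIN TBLS t ON t.DB_ID = d.DB_ID
JOIN PARTITIONS p ON p.TBL_ID = t.TBL_ID
JOIN SDS s ON s.SD_ID = p.SD_ID
WHERE d.NAME = ?
ORDER BY t.TBL_NAME, p.PART_NAME
"""


def partition_locations(conn, database: str) -> list:
    """Return (database, table, partition, location) rows for one database."""
    return conn.execute(PARTITION_LOCATION_SQL, (database,)).fetchall()


def demo_connection():
    """Build a tiny in-memory stand-in for a metastore (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE DBS (DB_ID INTEGER, NAME TEXT);
        CREATE TABLE TBLS (TBL_ID INTEGER, DB_ID INTEGER, TBL_NAME TEXT);
        CREATE TABLE SDS (SD_ID INTEGER, LOCATION TEXT);
        CREATE TABLE PARTITIONS (PART_ID INTEGER, TBL_ID INTEGER,
                                 SD_ID INTEGER, PART_NAME TEXT);
        INSERT INTO DBS VALUES (1, 'sales');
        INSERT INTO TBLS VALUES (10, 1, 'orders');
        INSERT INTO SDS VALUES (100, 'hdfs://ns1/warehouse/sales.db/orders/dt=2024-01-01');
        INSERT INTO SDS VALUES (101, 'hdfs://ns1/archive/orders/dt=2023-12-31');
        INSERT INTO PARTITIONS VALUES (1000, 10, 100, 'dt=2024-01-01');
        INSERT INTO PARTITIONS VALUES (1001, 10, 101, 'dt=2023-12-31');
    """)
    return conn


rows = partition_locations(demo_connection(), "sales")
```

In the sample data one partition lives outside the table's warehouse directory, which is exactly the kind of partition-level override that only shows up when every partition location is inspected.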

Last modified: 14 April 2025