Kafka and Avro with Confluent Schema Registry

Date post: 21-Jan-2018
Upload: jean-paul-azar
Page 1: Kafka and Avro with Confluent Schema Registry

Cassandra / Kafka Support in EC2/AWS. Kafka Training, Kafka Consulting™

Avro

Kafka & Avro: Confluent Schema Registry

Managing Record Schema in Kafka

Page 2: Kafka and Avro with Confluent Schema Registry

Confluent Schema Registry

❖ Confluent Schema Registry stores Avro schemas for Kafka clients

❖ Provides a REST interface for putting and getting Avro schemas

❖ Stores a history of schemas

❖ versioned

❖ allows you to configure compatibility settings

❖ supports evolution of schemas

❖ Provides serializers used by Kafka clients, which handle schema storage and serialization of records using Avro

Page 3: Kafka and Avro with Confluent Schema Registry

Why Schema Registry?

❖ Producer creates a record/message, which is an Avro record

❖ Record contains the schema and data

❖ Schema Registry Avro Serializer serializes the data and the schema id (just the id, not the full schema)

❖ Keeps a cache of registered schemas and their ids from Schema Registry

❖ Consumer receives the payload and deserializes it with the Schema Registry Avro Deserializer

❖ Deserializer looks up the full schema from its cache or from Schema Registry based on the id

❖ Consumer has its own schema, the one it expects the record/message to conform to

❖ A compatibility check is performed on the two schemas

❖ if they don't match but are compatible, a payload transformation, aka Schema Evolution, happens

❖ if they are not compatible, deserialization fails

❖ Kafka records have a Key and a Value, and schemas can be registered for both
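The serialized payload described above carries only the schema id, not the schema itself. Confluent's wire format is a single magic byte (0), a 4-byte big-endian schema id, then the Avro-encoded bytes. A minimal sketch of that framing (the schema id 2 and the payload bytes here are made up for illustration):

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

public class WireFormat {
    // Confluent wire format: magic byte 0x0, then a 4-byte big-endian
    // schema id, then the Avro binary-encoded record.
    static byte[] frame(int schemaId, byte[] avroPayload) {
        ByteBuffer buf = ByteBuffer.allocate(1 + 4 + avroPayload.length);
        buf.put((byte) 0);      // magic byte
        buf.putInt(schemaId);   // id of the schema registered in Schema Registry
        buf.put(avroPayload);   // Avro-encoded key or value
        return buf.array();
    }

    public static void main(String[] args) {
        byte[] framed = frame(2, new byte[] { 0x10, 0x20 });
        System.out.println(Arrays.toString(framed)); // [0, 0, 0, 0, 2, 16, 32]
    }
}
```

The deserializer reads the id back out of those first five bytes to fetch the writer's schema before decoding the rest.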

Page 4: Kafka and Avro with Confluent Schema Registry

Schema Compatibility

❖ Backward compatibility (default)

❖ A new, backward-compatible schema will not break consumers

❖ Producers could be using an older schema that is backward compatible with the Consumer's schema

❖ Forward compatibility

❖ Records sent with a new, forward-compatible schema can be deserialized with older schemas

❖ Consumers can use an older schema and never be updated (maybe they never need the new fields)

❖ Full compatibility

❖ A new version of a schema is both backward and forward compatible

❖ None

❖ Schema will not be validated for compatibility at all

Page 5: Kafka and Avro with Confluent Schema Registry

Schema Registry Config

❖ Compatibility can be configured globally or per subject

❖ Options are:

❖ NONE - don't check for schema compatibility

❖ FORWARD - check that the new schema is forward compatible with the latest registered schema

❖ BACKWARD (default) - check that the new schema is backward compatible with the latest registered schema

❖ FULL - check that the new schema is both forward and backward compatible with the latest registered schema

Page 6: Kafka and Avro with Confluent Schema Registry

Schema Registry Actions

❖ Register schemas for keys and values of Kafka records

❖ List schemas (subjects)

❖ List all versions of a subject (schema)

❖ Retrieve a schema by version or id

❖ get the latest version of a schema

❖ Check to see if a schema is compatible with a certain version

❖ Get the compatibility level setting of the Schema Registry

❖ BACKWARD, FORWARD, FULL, NONE

❖ Add compatibility settings to a subject/schema

Page 7: Kafka and Avro with Confluent Schema Registry

Schema Evolution

❖ If an Avro schema is changed after data has been written to the store using an older version of that schema, then Avro might do a Schema Evolution

❖ Schema evolution is the automatic transformation of the Avro record

❖ the transformation is between the version of the schema the Consumer expects and the one the Producer used to put the record into the Kafka log

❖ When the Consumer schema is not identical to the Producer schema used to serialize the Kafka record, a data transformation is performed on the Kafka record (key or value)

❖ If the schemas match, then there is no need for a transformation

❖ Schema evolution happens only during deserialization, at the Consumer

❖ If the Consumer's schema is different from the Producer's schema, then the value or key is automatically modified during deserialization to conform to the Consumer's reader schema

Page 8: Kafka and Avro with Confluent Schema Registry

Allowed Schema Modifications

❖ Add a field with a default

❖ Remove a field that had a default value

❖ Change a field's order attribute

❖ Change a field's default value

❖ Remove or add a field alias

❖ Remove or add a type alias

❖ Change a type to a union that contains the original type

Page 9: Kafka and Avro with Confluent Schema Registry

Rules of the Road for Modifying Schemas

❖ Provide a default value for fields in your schema

❖ Allows you to delete the field later

❖ Don't change a field's data type

❖ When adding a new field to your schema, you have to provide a default value for the field

❖ Don't rename an existing field

❖ You can add an alias instead

Page 10: Kafka and Avro with Confluent Schema Registry

Remember our example Employee Avro schema, covered in the Avro/Kafka Tutorial

Page 11: Kafka and Avro with Confluent Schema Registry

Let's say

❖ Employee did not have an age in version 1 of the schema

❖ Later we decided to add an age field with a default value of -1

❖ Now let's say we have a Producer using version 2, and a Consumer using version 1
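Version 2 of such a schema might look like the fragment below. The namespace is taken from the version listing shown later in this deck; the fields other than age are illustrative, since the real Employee fields are defined in the earlier Avro/Kafka Tutorial:

```json
{
  "type": "record",
  "name": "Employee",
  "namespace": "com.cloudurable.phonebook",
  "fields": [
    {"name": "firstName", "type": "string"},
    {"name": "lastName",  "type": "string"},
    {"name": "age", "type": "int", "default": -1}
  ]
}
```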

Page 12: Kafka and Avro with Confluent Schema Registry

Scenario: adding a new field age with a default value

❖ Producer uses version 2 of the Employee schema, creates a com.cloudurable.Employee record, sets the age field to 42, then sends it to Kafka topic new-employees

❖ Consumer consumes records from new-employees using version 1 of the Employee schema

❖ Since the Consumer is using version 1 of the schema, the age field is removed during deserialization

❖ The same Consumer modifies the name field and then writes the record back to a NoSQL store

❖ When it does this, the age field is missing from the value that it writes to the store

❖ Another client using version 2 reads the record from the NoSQL store

❖ The age field is missing from the record (because the Consumer wrote it with version 1), so age is set to the default value of -1

Page 13: Kafka and Avro with Confluent Schema Registry

Schema Registry Actions

❖ Register schemas for keys and values of Kafka records

❖ List schemas (subjects)

❖ List all versions of a subject (schema)

❖ Retrieve a schema by version or id

❖ get the latest version of a schema

❖ Check to see if a schema is compatible with a certain version

❖ Get the compatibility level setting of the Schema Registry

❖ BACKWARD, FORWARD, FULL, NONE

❖ Add compatibility settings to a subject/schema

Page 14: Kafka and Avro with Confluent Schema Registry


Register a Schema

Page 15: Kafka and Avro with Confluent Schema Registry

Register a Schema

curl -X POST -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  --data '{"schema": "{\"type\": …}"}' \
  http://localhost:8081/subjects/Employee/versions

{"id":2}

Page 16: Kafka and Avro with Confluent Schema Registry

List All Schemas

curl -X GET http://localhost:8081/subjects

["Employee","Employee2","FooBar"]

Page 17: Kafka and Avro with Confluent Schema Registry

Working with Versions

[1,2,3,4,5]

{"subject":"Employee","version":2,"id":4,"schema":"{\"type\":\"record\",\"name\":\"Employee\",\"namespace\":\"com.cloudurable.phonebook\", …

{"subject":"Employee","version":1,"id":3,"schema":"{\"type\":\"record\",\"name\":\"Employee\",\"namespace\":\"com.cloudurable.phonebook\", …

Page 18: Kafka and Avro with Confluent Schema Registry


Working with Schemas
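The code for this slide was a screenshot and did not survive the transcript. The REST calls it likely demonstrated - fetching a schema by global id and fetching the latest version of a subject - can be sketched with the JDK's built-in HttpClient types. The requests below are only constructed, not sent, and assume the registry at localhost:8081 used on the other slides:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class WorkingWithSchemas {
    static final String REGISTRY = "http://localhost:8081";

    // GET /schemas/ids/{id} - fetch a schema by its global id
    static HttpRequest schemaById(int id) {
        return HttpRequest.newBuilder()
                .uri(URI.create(REGISTRY + "/schemas/ids/" + id))
                .GET()
                .build();
    }

    // GET /subjects/{subject}/versions/latest - latest version of a subject
    static HttpRequest latestVersion(String subject) {
        return HttpRequest.newBuilder()
                .uri(URI.create(REGISTRY + "/subjects/" + subject + "/versions/latest"))
                .GET()
                .build();
    }

    public static void main(String[] args) {
        System.out.println(schemaById(2).uri());
        System.out.println(latestVersion("Employee").uri());
    }
}
```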

Page 19: Kafka and Avro with Confluent Schema Registry


Changing Compatibility Checks
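This slide's screenshot is also missing. Changing the compatibility level is done with a PUT to the registry's /config endpoint (global default) or /config/{subject} (per subject). A sketch that builds, but does not send, such a request:

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class ChangeCompatibility {
    // PUT /config/{subject} with {"compatibility": "<LEVEL>"} changes the
    // compatibility check for one subject; PUT /config changes the global default.
    static HttpRequest setCompatibility(String subject, String level) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/config/" + subject))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .PUT(HttpRequest.BodyPublishers.ofString(
                        "{\"compatibility\": \"" + level + "\"}"))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = setCompatibility("Employee", "NONE");
        System.out.println(req.method() + " " + req.uri());
    }
}
```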

Page 20: Kafka and Avro with Confluent Schema Registry

Incompatible Change

{"error_code":409,"message":"Schema being registered is incompatible with an earlier schema"}

Page 21: Kafka and Avro with Confluent Schema Registry


Incompatible Change

{"is_compatible":false}
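An {"is_compatible":false} response like the one above comes from the registry's compatibility-check endpoint: POST /compatibility/subjects/{subject}/versions/{version}, which tests a candidate schema without registering it. A sketch (request built, not sent; the schema body is left elided):

```java
import java.net.URI;
import java.net.http.HttpRequest;

public class CompatibilityCheck {
    // POST /compatibility/subjects/{subject}/versions/{version} tests a candidate
    // schema against a registered version without registering it.
    static HttpRequest check(String subject, String version, String schemaJson) {
        return HttpRequest.newBuilder()
                .uri(URI.create("http://localhost:8081/compatibility/subjects/"
                        + subject + "/versions/" + version))
                .header("Content-Type", "application/vnd.schemaregistry.v1+json")
                .POST(HttpRequest.BodyPublishers.ofString(schemaJson))
                .build();
    }

    public static void main(String[] args) {
        HttpRequest req = check("Employee", "latest", "{\"schema\": \"…\"}");
        System.out.println(req.method() + " " + req.uri());
    }
}
```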

Page 22: Kafka and Avro with Confluent Schema Registry

Use Schema Registry

❖ Start up the Schema Registry server pointing to the ZooKeeper cluster

❖ Import the Kafka Avro Serializer and Avro jars

❖ Configure the Producer to use Schema Registry

❖ Use KafkaAvroSerializer from the Producer

❖ Configure the Consumer to use Schema Registry

❖ Use KafkaAvroDeserializer from the Consumer

Page 23: Kafka and Avro with Confluent Schema Registry

Start up Schema Registry Server

cat ~/tools/confluent-3.2.1/etc/schema-registry/schema-registry.properties

listeners=http://0.0.0.0:8081
kafkastore.connection.url=localhost:2181
kafkastore.topic=_schemas
debug=false

Page 24: Kafka and Avro with Confluent Schema Registry


Import Kafka Avro Serializer & Avro Jars
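The dependency list on this slide was a screenshot. Assuming Maven and the Confluent 3.2.1 release used elsewhere in this deck, the coordinates would look roughly like this (the Avro version shown is plausible for that release, not confirmed by the slide; the io.confluent artifacts come from Confluent's own Maven repository, not Maven Central):

```xml
<dependency>
  <groupId>io.confluent</groupId>
  <artifactId>kafka-avro-serializer</artifactId>
  <version>3.2.1</version>
</dependency>
<dependency>
  <groupId>org.apache.avro</groupId>
  <artifactId>avro</artifactId>
  <version>1.8.1</version>
</dependency>
```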

Page 25: Kafka and Avro with Confluent Schema Registry


Configure Producer to use Schema Registry
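The configuration on this slide was a screenshot. A sketch of the producer properties it likely set - the serializer class names and the schema.registry.url key are Confluent's; the broker and registry addresses are assumptions matching the rest of the deck:

```java
import java.util.Properties;

public class ProducerConfigExample {
    static Properties producerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // Kafka broker (assumed)
        // Confluent serializers register/look up schemas in Schema Registry
        props.put("key.serializer",
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("value.serializer",
                "io.confluent.kafka.serializers.KafkaAvroSerializer");
        props.put("schema.registry.url", "http://localhost:8081"); // from the earlier slide
        return props;
    }

    public static void main(String[] args) {
        System.out.println(producerProps().getProperty("value.serializer"));
    }
}
```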

Page 26: Kafka and Avro with Confluent Schema Registry


Use KafkaAvroSerializer from Producer
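The producer code itself was also a screenshot. In Java-style pseudocode (not compiled here - it assumes kafka-clients and the Confluent serializer jars on the classpath, and an Employee class generated from the Avro schema):

```
// pseudocode sketch - requires kafka-clients + kafka-avro-serializer jars
Producer<String, Employee> producer = new KafkaProducer<>(producerProps());
Employee bob = Employee.newBuilder().setFirstName("Bob")….build();
producer.send(new ProducerRecord<>("new-employees", bob.getFirstName(), bob));
producer.flush();
producer.close();
```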

Page 27: Kafka and Avro with Confluent Schema Registry


Configure Consumer to use Schema Registry
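As with the producer, the consumer configuration was a screenshot. A sketch of the properties it likely set - the deserializer class names, schema.registry.url, and specific.avro.reader keys are Confluent's; the broker address and group id are assumptions:

```java
import java.util.Properties;

public class ConsumerConfigExample {
    static Properties consumerProps() {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // Kafka broker (assumed)
        props.put("group.id", "employee-consumer");       // consumer group (assumed)
        // Confluent deserializers fetch the writer schema from Schema Registry by id
        props.put("key.deserializer",
                "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("value.deserializer",
                "io.confluent.kafka.serializers.KafkaAvroDeserializer");
        props.put("schema.registry.url", "http://localhost:8081");
        // deserialize into the generated Employee class instead of GenericRecord
        props.put("specific.avro.reader", "true");
        return props;
    }

    public static void main(String[] args) {
        System.out.println(consumerProps().getProperty("value.deserializer"));
    }
}
```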

Page 28: Kafka and Avro with Confluent Schema Registry


Use KafkaAvroDeserializer from Consumer
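The consumer loop was likewise a screenshot. In Java-style pseudocode (not compiled here - same classpath assumptions as the producer sketch):

```
// pseudocode sketch - requires kafka-clients + kafka-avro-serializer jars
Consumer<String, Employee> consumer = new KafkaConsumer<>(consumerProps());
consumer.subscribe(Collections.singletonList("new-employees"));
while (true) {
    ConsumerRecords<String, Employee> records = consumer.poll(100);
    for (ConsumerRecord<String, Employee> record : records) {
        // record.value() has already been evolved to the consumer's reader schema
        System.out.println(record.value());
    }
}
```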

Page 29: Kafka and Avro with Confluent Schema Registry

Schema Registry

❖ Confluent provides Schema Registry to manage Avro schemas for Kafka Consumers and Producers

❖ Avro provides schema migration

❖ Confluent uses schema compatibility checks to see if the Producer's schema and the Consumer's schema are compatible, and to do schema evolution if needed

❖ Use KafkaAvroSerializer from the Producer

❖ Use KafkaAvroDeserializer from the Consumer

