1. What Is Power BI?
Microsoft Power BI is a data analysis tool that lets users build charts and reports for business intelligence purposes. Data can be imported into Power BI from sources such as:
files,
databases,
web services,
Microsoft Azure services,
3rd party services.
There are two main versions of Power BI: Power BI Desktop and Power BI Cloud.
The Power BI Desktop version has a wide range of connection possibilities. Learn more...
The Power BI Cloud version is limited to files, some Azure-based database sources, and streaming datasets. Learn more...
Cloud (all platforms, browser-based)
- Data sources: files (CSV, XLS), Azure databases, HTTP REST, Azure Stream Analytics, PubNub
- Streaming datasets: yes (limited)
- Pros: no Windows required; easy integration via HTTP API
- Cons: limited data source selection; restrictions on incoming traffic and payload size

Desktop (Windows)
- Data sources: files (CSV, XLS), databases, Azure services, 3rd-party services
- Streaming datasets: no
- Pros: flexible; many more data sources than the Cloud version
- Cons: requires a Windows PC to run; no streaming datasets
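For context, "easy integration via HTTP API" refers to Power BI streaming datasets, which accept rows via a plain HTTP POST. A minimal sketch, assuming a streaming dataset already exists; the push URL is a placeholder for the one Power BI generates, and the fields page_name / timestamp are illustrative:
# Push one row into a Power BI streaming dataset (Cloud version only).
curl -X POST "<push URL generated by Power BI for the streaming dataset>" \
  -H "Content-Type: application/json" \
  -d '[{"page_name": "en.home", "timestamp": "2021-01-01T12:00:00Z"}]'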
2. Proposed Architecture For Power BI Desktop
2.1 Description
If you store your analysis data on SQL Server, MariaDB, PostgreSQL or similar, Kafka Connect should be an alternative to consider.
Here is what a typical set-up looks like.
Mapp architecture: the Public Kafka cluster contains the topics that represent the Data Streams (Root / Custom).
Kafka Connect will be configured to consume from Mapp's Public Kafka and to write the events, in an appropriate table form, into the database of your choice using the JDBC Sink Connector for Kafka Connect (Confluent Community License). The data can then be read from Power BI Desktop.
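In short, the data flows from Mapp's Public Kafka (the Data Stream topics) through Kafka Connect with the JDBC Sink Connector into the relational database, and Power BI Desktop reads it from there.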
2.2 Requirements
At the moment, only Avro Streams can be used for this since they contain schema information.
JSON Streams cannot be used, since Mapp does not include schema information in them (read more...). The schema would have to be added by the RDS Replicator or by KSQL.
Both Root and Custom streams are in theory suitable to be consumed by Kafka Connect.
Primitive Types can be selected and should be mapped to the respective column in the corresponding table.
Nested Types (Array, Record, etc.) are supported, but they need to be flattened first using a Single Message Transformation (a configuration sketch is shown after this list).
The usage of custom streams may be easier due to the flattened structure.
If the use case is more complex, implementing a custom converter may be necessary.
Destination table names can be set using table.name.format. However, the ${topic} placeholder cannot be used, because Mapp's topic names contain dots, which break the generated SQL.
The batch.size property of the JDBC Sink Connector can be adjusted, depending on the account's traffic, to get the desired update pace in the database.
The Primary Key of the destination table can be defined from the Kafka record key (session ID), from a field of the Kafka record value (e.g. request ID), or not at all.
The columns of the destination table can be set using a whitelist of fields in the record.
Latency should be acceptable despite the introduction of Kafka Connect: the data is transferred over the internet, Kafka Connect writes it into the database, and Power BI reads it from there.
Kafka Connect can be scaled out by running it in distributed mode across multiple servers.
The database can usually be scaled up vertically.
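For reference, flattening nested record values can be done with Kafka Connect's built-in Flatten transformation by adding a few lines to the sink connector configuration. This is only a sketch; the transform alias flatten and the underscore delimiter are arbitrary choices:
# Flatten nested fields in the record value before writing to the database.
transforms=flatten
transforms.flatten.type=org.apache.kafka.connect.transforms.Flatten$Value
# Nested field names are joined with this delimiter, e.g. parent_child.
transforms.flatten.delimiter=_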
2.3 Set-up
The set-up below assumes that:
Your Kafka user has rights to consume from the Custom Avro Stream.
You have a Kerberos ticket for that user (e.g. your own user).
Work is done in a macOS / Linux environment.
A MariaDB or MySQL database is used. If not, a test MariaDB instance can be set up quickly with Docker (see the example below).
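A minimal sketch of such a throwaway instance, assuming Docker is installed; the container name, root password, and the host port 3406 (chosen to match the JDBC connection string used later) are arbitrary:
# Start a disposable MariaDB instance; host port 3406 maps to MariaDB's default port 3306.
docker run -d --name powerbi-mariadb \
  -e MYSQL_ROOT_PASSWORD=password \
  -p 3406:3306 \
  mariadb:10.5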
2.3.1 Setting Up The Stream
Set up the stream via Mapp Intelligence > Data streams.
Create a Custom Avro Stream and project the Page Name.
2.3.2 Downloading Kafka and Dependencies
Prepare the local setup of Kafka Connect.
Download the latest version of Kafka from their website and unpack the downloaded file.
Include the JDBC Sink Connector (jar) and the Confluent Avro Converter dependencies in the Apache Kafka setup.
Place the JDBC Sink Connector JAR in the Kafka folder under a new subfolder named connect-plugins.
The Confluent Avro Converter ZIP also includes docs and assets. Decompress it and copy the contents of its lib folder into the libs folder of the Kafka setup, for example as shown below.
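A sketch of these two placement steps, assuming the Kafka archive was unpacked to kafka_2.12-2.6.0 (matching the plugin.path used later) and that the downloaded JAR and ZIP are in the current directory; the file names are illustrative:
# Put the JDBC Sink Connector JAR into a dedicated plugin folder.
mkdir -p kafka_2.12-2.6.0/connect-plugins
cp kafka-connect-jdbc-*.jar kafka_2.12-2.6.0/connect-plugins/
# Unpack the Confluent Avro Converter and copy its lib contents into Kafka's libs folder.
unzip confluentinc-kafka-connect-avro-converter-*.zip -d avro-converter
cp avro-converter/*/lib/* kafka_2.12-2.6.0/libs/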
2.3.3 Setting Up The Kafka Connect Configuration
Configure the Kafka Connect Standalone service with a properties file (examples are provided in the config folder of Kafka).
Name it, for example, connect-standalone.properties and place it in the config folder of the Kafka setup.
# For customers, this would be our Public Kafka endpoint (stream.webtrekk.com).
bootstrap.servers=public-kafka-wd-int-01.nbg.webtrekk.com:9092
# Incoming Records Have A String Key (SessionID)
key.converter=org.apache.kafka.connect.storage.StringConverter
# Incoming Records Have An Avro Value (Flattened Event)
value.converter=io.confluent.connect.avro.AvroConverter
# For customers, this would be our public Schema Registry endpoint.
value.converter.schema.registry.url=https://warp-schema-registry-int-01.nbg.webtrekk.com:8080
# There is no Kafka Connect Schema information.
key.converter.schemas.enable=false
value.converter.schemas.enable=false
# Standalone mode stores offsets in a file.
offset.storage.file.filename=/tmp/connect.offsets
offset.flush.interval.ms=10000
# Path to the folder where we located the JDBC Sink plugin.
plugin.path=/path/to/local/kafka_2.12-2.6.0/connect-plugins
# Security Properties For The Admin Client (will be different in Production for customers).
# The truststore settings are not necessary for our customers.
ssl.truststore.location=/path/to/local/nbgtruststore.jks
ssl.truststore.password=password
security.protocol=SASL_SSL
# For customers, this would be SCRAM-SHA-256.
sasl.mechanism=GSSAPI
# Consumer Group (for customers this is the TrackId).
consumer.group.id=mygroupid
# Security Properties For The Consumer (will be different in Production for customers).
consumer.ssl.truststore.location=/path/to/local/nbgtruststore.jks
consumer.ssl.truststore.password=password
consumer.security.protocol=SASL_SSL
consumer.sasl.mechanism=GSSAPI
2.3.4 Setting Up The JDBC Sink Connector Configuration
Configure the JDBC Sink Connector with another configuration file.
Name it, for example, jdbc-sink-connector.properties and place it under the config folder of the Kafka setup:
# Connector Name, can be arbitrary.
name=jdbc-sink
# Class of the JDBC Sink Connector
connector.class=io.confluent.connect.jdbc.JdbcSinkConnector
# For testing, 1 Task Max is enough.
tasks.max=1
# Topic Name we will be consuming from.
topics=715326913588444.custom.powerbi.tracking.v1.avro
# JDBC Connection String to the database
connection.url=jdbc:mysql://127.0.0.1:3406/powerbi?serverTimezone=UTC
# User name
connection.user=username
# Password
connection.password=password
# Create table for the data if necessary.
auto.create=true
# Number of records polled from Kafka to write into the database per batch.
batch.size=1
# Name of the destination table
table.name.format=kafka_powerbi
# Derive PK for the table from the value of the record.
pk.mode=record_value
# Use the field 'ID' as PK.
pk.fields=ID
# List of fields to extract from the record value; they map to columns in the table.
fields.whitelist=ID,TIMESTAMP,PAGE_NAME
2.3.5 Creating a new Database
Create the powerbi database that we defined in the JDBC connection string. For MariaDB, you can use MySQL Workbench or another tool of your choice, for example the mysql command-line client as shown below.
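A minimal sketch using the mysql command-line client against the test instance above (host, port, and credentials must match your own database):
# Create the database referenced in connection.url.
mysql -h 127.0.0.1 -P 3406 -u root -p -e "CREATE DATABASE powerbi;"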
2.3.6 Setting Up JAAS Configuration File
Prepare the JAAS file, for instance client_jaas.conf, and place it in your Kafka setup folder.
Example (this one uses Kerberos (GSSAPI); as a customer you will use SCRAM-SHA-256 instead, see the sketch after the block).
KafkaClient {
  com.sun.security.auth.module.Krb5LoginModule required
  serviceName=kafka
  useTicketCache=true;
};
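For reference, the SCRAM-SHA-256 variant that customers would use looks roughly like this (the username and password are placeholders for your stream credentials, and sasl.mechanism must then be set to SCRAM-SHA-256 in the Kafka Connect properties):
KafkaClient {
  org.apache.kafka.common.security.scram.ScramLoginModule required
  username="your-username"
  password="your-password";
};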
2.3.7 Running Kafka Connect
1. Export the environment variable KAFKA_OPTS with the -Djava.security.auth.login.config JVM flag pointing to the JAAS file you created in the previous step:
export KAFKA_OPTS="-Djava.security.auth.login.config=/path/to/client_jaas.conf"
2. Get a Kerberos ticket for authenticating. This is usually done using kinit:
kinit user@REALM
# e.g. kinit sergio.rozas@WEBTREKK.COM
3. Run Kafka Connect in Standalone mode (assuming you are located in the Kafka folder and you have placed both configuration files under config):
bin/connect-standalone.sh config/connect-standalone.properties config/jdbc-sink-connector.properties
4. Wait until Kafka Connect Standalone boots up. Below is an example of what you will see once this is done.
[INFO] Cluster ID: s_UVKbSqSb22543chcau9A (org.apache.kafka.clients.Metadata:279)
[INFO] Discovered group coordinator public-kafka-wd-int-02.nbg.webtrekk.com:9092 (id: 2147483645 rack: null) (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:815)
[INFO] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:553)
[INFO] Join group failed with org.apache.kafka.common.errors.MemberIdRequiredException: The group member needs to have a valid member id before actually entering a consumer group. (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:456)
[INFO] (Re-)joining group (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:553)
[INFO] Finished assignment for group at generation 7: {connector-consumer-jdbc-sink-0-ac9d186e-91a1-4885-a6c0-e437a9b3e118=Assignment(partitions=[715326913588444.custom.powerbi.tracking.v1.avro-0, 715326913588444.custom.powerbi.tracking.v1.avro-1, 715326913588444.custom.powerbi.tracking.v1.avro-2])} (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:627)
[INFO] Successfully joined group with generation 7 (org.apache.kafka.clients.consumer.internals.AbstractCoordinator:504)
[INFO] Notifying assignor about the new Assignment(partitions=[715326913588444.custom.powerbi.tracking.v1.avro-0, 715326913588444.custom.powerbi.tracking.v1.avro-1, 715326913588444.custom.powerbi.tracking.v1.avro-2]) (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:276)
[INFO] Adding newly assigned partitions: 715326913588444.custom.powerbi.tracking.v1.avro-0, 715326913588444.custom.powerbi.tracking.v1.avro-2, 715326913588444.custom.powerbi.tracking.v1.avro-1 (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:288)
[INFO] Setting offset for partition 715326913588444.custom.powerbi.tracking.v1.avro-0 to the committed offset FetchPosition{offset=8, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[public-kafka-wd-int-03.nbg.webtrekk.com:9092 (id: 3 rack: null)], epoch=0}} (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:815)
[INFO] Setting offset for partition 715326913588444.custom.powerbi.tracking.v1.avro-2 to the committed offset FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[public-kafka-wd-int-02.nbg.webtrekk.com:9092 (id: 2 rack: null)], epoch=0}} (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:815)
[INFO] Setting offset for partition 715326913588444.custom.powerbi.tracking.v1.avro-1 to the committed offset FetchPosition{offset=0, offsetEpoch=Optional.empty, currentLeader=LeaderAndEpoch{leader=Optional[public-kafka-wd-int-01.nbg.webtrekk.com:9092 (id: 1 rack: null)], epoch=0}} (org.apache.kafka.clients.consumer.internals.ConsumerCoordinator:815)
2.3.8 Testing Kafka Connect
1. Send a dummy request with a page name (since you enabled that field in the stream projections) to your Tracker:
curl "http://tracker-int-01.nbg.webtrekk.com:10001/715326913588444/wt?p=300,testTrafficXXXX"
2. Replace XXXX with something unique (e.g. the current time) so you can identify your requests in the result.
3. Check that you see the requests in Kafka and that they arrive in the destination table, for example:
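A quick way to verify the end-to-end flow is to query the destination table defined in table.name.format (host, port, and credentials as in your own set-up):
# Inspect the rows written by the JDBC Sink Connector.
mysql -h 127.0.0.1 -P 3406 -u root -p -e "SELECT * FROM powerbi.kafka_powerbi LIMIT 10;"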