Big Data Applications in Tourism

October 31, 2016, 7:51 am

An interesting study from our friends at Territorio Creativo, which gives a good overview of Big Data applications in tourism.

For our part, here are some examples of applications in tourism and some Big Data demos that apply to different areas.

↧

Big Data OLAP Analysis on Hadoop with Apache Kylin

November 2, 2016, 7:47 am

In this case study we use Apache Kylin and STPivot to support interactive OLAP analysis of a data warehouse (DW) that holds data with Big Data characteristics (Volume, Velocity and Variety).

The source is a large Volume of academic data covering the last 15 years of a large university. From this data source we designed a multidimensional model for analyzing academic performance. It holds about 100 million rows, with measures such as the credits for subjects passed, failed or enrolled. These facts are analyzed along several dimensions, or analysis contexts, such as Gender or Grade, plus the ever-present time component, the Academic Year.

Because this Volume of data is too large to analyze with acceptable performance on traditional OLAP systems (R-OLAP and M-OLAP), we decided to try Apache Kylin, which promises query response times of a few seconds for Volumes above 10 billion rows.

The Hadoop technologies that Kylin fundamentally relies on are Apache Hive and Apache HBase.
The data warehouse (DW) is built as a star schema and kept in Apache Hive.

From this model, and from a metadata definition of the OLAP cube, Apache Kylin runs an offline process that builds a multidimensional (MOLAP) cube in HBase.

See Big Data OLAP in action

From then on, Kylin lets us query the cube through its SQL interface, which is also accessible through JDBC/ODBC connectors.
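As a rough illustration of what such a query can look like, the sketch below builds (but does not send) an HTTP request for Kylin's REST query endpoint, which accepts the same SQL as the JDBC/ODBC drivers. The table and column names, the project name, the host and the credentials are all hypothetical placeholders, not taken from the case study.

```python
import base64
import json
import urllib.request


def build_kylin_query(base_url, project, sql, user, password, limit=100):
    """Build (without sending) a POST request for Kylin's REST query endpoint."""
    payload = {
        "sql": sql,
        "project": project,
        "offset": 0,
        "limit": limit,
        "acceptPartial": False,
    }
    # Kylin's REST API uses HTTP Basic authentication.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return urllib.request.Request(
        url=f"{base_url}/kylin/api/query",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Basic {token}",
        },
        method="POST",
    )


# Hypothetical star-schema query: credits passed per academic year and gender.
SQL = (
    "SELECT academic_year, gender, SUM(credits_passed) AS credits_passed "
    "FROM fact_academic_performance "
    "GROUP BY academic_year, gender"
)

request = build_kylin_query(
    "http://localhost:7070", "university_demo", SQL, "ADMIN", "KYLIN"
)
# Against a live server one would call: urllib.request.urlopen(request)
```

Building the request separately from sending it keeps the example runnable without a Kylin server.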

Finally, to enable OLAP analysis through MDX queries and the corresponding multidimensional tables or views, we use STPivot.

STPivot is an OLAP viewer developed by StrateBI as part of the STTools suite.
STPivot uses Mondrian as its OLAP server and can be deployed on a BI server such as Pentaho BA Server; both are open source. This way, STPivot can create and explore multidimensional views or tables, like the ones in this demo, that use the OLAP cube built with Apache Kylin.

Developed by eBay and later released as an Apache open source project, Kylin is an open source tool that supports online analytical processing (OLAP) of large volumes of data with Big Data characteristics (Volume, Velocity and Variety).

Until Kylin arrived, OLAP technology was limited to relational databases or, at best, to engines optimized for multidimensional storage, technologies with significant limitations when facing Big Data.

Apache Kylin, built on several technologies from the Hadoop ecosystem, provides a SQL interface for multidimensional analysis queries over a data set, and achieves very low query times (seconds) for fact tables that can reach 10 billion rows or more.
The Hadoop technologies that Kylin fundamentally relies on are Apache Hive and Apache HBase.

The data warehouse (DW) is built as a star schema and kept in Apache Hive. From this model, and from a metadata definition of the OLAP cube, Apache Kylin runs an offline process that builds a multidimensional (MOLAP) cube in HBase. This structure is optimized for querying through the SQL interface that Kylin provides.
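To make the idea of an offline cube build more concrete, here is a toy sketch of MOLAP pre-aggregation: it computes SUM totals for every combination of dimensions ahead of time, so a later query becomes a dictionary lookup. It is only an in-memory illustration with hypothetical column names, not how Kylin stores cuboids in HBase.

```python
from collections import defaultdict
from itertools import combinations


def build_cuboids(rows, dimensions, measure):
    """Pre-aggregate SUM(measure) for every subset of the given dimensions."""
    cuboids = {}
    for size in range(len(dimensions) + 1):
        for dims in combinations(dimensions, size):
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            cuboids[dims] = dict(totals)
    return cuboids


# Hypothetical fact rows (the real model holds about 100 million).
rows = [
    {"academic_year": 2015, "gender": "F", "credits_passed": 6.0},
    {"academic_year": 2015, "gender": "M", "credits_passed": 4.5},
    {"academic_year": 2016, "gender": "F", "credits_passed": 9.0},
]

cube = build_cuboids(rows, ("academic_year", "gender"), "credits_passed")
by_year = cube[("academic_year",)]   # totals grouped by year only
grand_total = cube[()][()]           # the empty cuboid holds the overall total
```

With n dimensions there are 2^n cuboids, which is why the build is done offline and why the choice of dimensions in the cube metadata matters.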

So, when Kylin receives a SQL query, it must decide whether it can answer it from the MOLAP cube in HBase (in milliseconds or seconds) or whether, on the contrary, the query is not covered by the MOLAP cube and has to be run against the star schema in Apache Hive (minutes), which is infrequent.
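The routing decision described here can be pictured with a deliberately simplified sketch: a query is served from the cube only if every dimension and measure it uses was included in the cube definition. The function and column names are hypothetical, and Kylin's real planner is considerably more elaborate than this check.

```python
def choose_engine(query_dims, query_measures, cube_dims, cube_measures):
    """Return "cube" when the cube covers the query, otherwise "hive"."""
    covered = (
        set(query_dims) <= set(cube_dims)
        and set(query_measures) <= set(cube_measures)
    )
    return "cube" if covered else "hive"


# Hypothetical cube definition for the academic performance model.
CUBE_DIMS = {"academic_year", "gender", "grade"}
CUBE_MEASURES = {"credits_passed", "credits_enrolled"}

fast_path = choose_engine({"gender"}, {"credits_passed"}, CUBE_DIMS, CUBE_MEASURES)
slow_path = choose_engine({"postal_code"}, {"credits_passed"}, CUBE_DIMS, CUBE_MEASURES)
```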

Finally, thanks to SQL and the availability of JDBC/ODBC drivers, we can connect Business Intelligence tools such as Tableau, Apache Zeppelin or even MDX query engines such as Pentaho Mondrian, enabling multidimensional analysis in its usual forms: multidimensional views or tables, dashboards or reports.

See Big Data Dashboard in action

STPivot is an OLAP viewer that is both powerful and easy to use, developed by StrateBI as part of the STTools suite of Business Intelligence applications.

The goal of this viewer is to improve the user experience by making OLAP analysis as simple as dragging and dropping measures and analysis contexts onto a canvas, so that the OLAP view is generated transparently for the user.

In addition, query wizards, novel charts alongside the multidimensional tables themselves, an advanced formula editor and export of views in different formats for publication are some of STPivot's most notable features, and they set our tool apart from other existing OLAP viewers.

As for its architecture, STPivot runs on the Mondrian MDX execution engine.
That is why STPivot can be used as an application on the open source Business Intelligence server Pentaho BA Server (CE), which already includes Mondrian.

Thanks to JDBC connectivity, Mondrian can connect to Apache Kylin, which makes this Big Data OLAP source usable from STPivot.

As the Big Data source for this demo, we have a large Volume of fictitious academic data covering the last 15 years of a large university that more than one million students have attended in that time. From this data source we designed a multidimensional model for analyzing academic performance.

It holds about 100 million rows, with measures such as the sum of credits for subjects passed, failed or enrolled.

There are also other measures derived from these, and therefore more complex, such as the Performance Rate and the Success Rate, calculated respectively as the ratio of Credits Passed to Credits Enrolled and the ratio of Credits Passed to Credits Presented (credits for which the student sat the exam).
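The two derived measures can be written down as a short sketch. The function names are hypothetical; the formulas are the two ratios described above, with a guard for an empty denominator.

```python
def ratio(numerator, denominator):
    """Return numerator / denominator, or None when the denominator is zero."""
    return numerator / denominator if denominator else None


def performance_rate(credits_passed, credits_enrolled):
    """Performance Rate: Credits Passed over Credits Enrolled."""
    return ratio(credits_passed, credits_enrolled)


def success_rate(credits_passed, credits_presented):
    """Success Rate: Credits Passed over Credits Presented."""
    return ratio(credits_passed, credits_presented)


# Example with made-up totals for one group of students.
perf = performance_rate(45, 60)
succ = success_rate(45, 50)
```

Because these are ratios, they are not additive: for any aggregation level the rate should be computed from the summed credits of that level, not by averaging the rates of individual rows.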

No less important are the dimensions, or analysis contexts, along which these measures are analyzed. The single-level dimensions are Gender, Grade, Age Range and the ever-present time component, the Academic Year. We have also included two complex dimensions, with two-level hierarchies and higher cardinality, since dimensions of this kind are common in practice.

With the Study dimension we can analyze the data grouped at the Type of Study level (Bachelor's degree (Grado), Master's, Doctorate, ...) or drill down (the Drill Down operation on the OLAP view) to the individual Study Plans, that is, the specific degrees, such as "315-Grado en Biología".
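A small sketch of what roll-up and drill-down mean on this two-level hierarchy follows. Only "315-Grado en Biología" comes from the text; the other plan names, the column names and the credit figures are invented for illustration.

```python
from collections import defaultdict


def roll_up(rows, level, measure):
    """Sum a measure grouped by one hierarchy level."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[level]] += row[measure]
    return dict(totals)


def drill_down(rows, parent_level, parent_value, child_level, measure):
    """Restrict to one parent member, then group by the child level."""
    subset = [r for r in rows if r[parent_level] == parent_value]
    return roll_up(subset, child_level, measure)


rows = [
    {"study_type": "Grado", "study_plan": "315-Grado en Biología", "credits": 120.0},
    {"study_type": "Grado", "study_plan": "Plan A (hypothetical)", "credits": 80.0},
    {"study_type": "Máster", "study_plan": "Plan B (hypothetical)", "credits": 30.0},
]

by_type = roll_up(rows, "study_type", "credits")
grado_plans = drill_down(rows, "study_type", "Grado", "study_plan", "credits")
```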

↧

Open Source Business Intelligence tips and tricks in October 2016

November 4, 2016, 2:02 am

Here are the latest tips on open source Business Intelligence from October, mainly about Pentaho, CTools and Saiku. You can see some of these tips implemented in the Demo Online.

This month brings some great material:

- https://www.panorama.com/blog/history-business-intelligence/

- http://pedroalves-bi.blogspot.com.es/2016/09/ctools-iot-smart-cities-and-more.html

- https://www.youtube.com/watch?v=WgoPYx21xYU&app=desktop

- http://todobi.blogspot.com.es/2016/09/location-intelligence-bringing-together.html

- https://github.com/mbostock/shapefile/blob/master/README.md

- http://todobi.blogspot.com.es/2016/11/analysis-big-data-olap-sobre-hadoop-con.html

- http://www.lewisgavin.co.uk/CDE-Dashboard/

- http://rpbouman.blogspot.com.es/2016/05/odxl-generic-data-export-layer-for.html

- http://pedroalves-bi.blogspot.com.es/2016/10/pentaho-7.0.html

- https://github.com/kleysonr/NMC-samples

- http://diethardsteiner.github.io/flink/2016/09/18/Flink-Twitter-Stream.html

- http://ubiquis.co.uk/dwh/status-change-fact-table-part-1-the-problem/

- http://todobi.blogspot.com.es/2016/10/list-of-open-source-solutions-for-smart.html

- https://redcloverbi.wordpress.com/2016/10/21/backup-y-restore-en-pentaho-de-forma-facil/

- https://github.com/bhagyas/awesome-alfresco

- http://todobi.blogspot.com.es/2016/10/twitter-real-time-dashboard.html

- http://ubiquis.co.uk/dwh/status-change-fact-table-part-2-the-input-data/

- http://ubiquis.co.uk/dwh/status-change-fact-table-part-4-a-pdi-implementation/

- https://github.com/jazzido/mondrian-rest

↧

List of Open Source Business Intelligence tools

November 5, 2016, 10:02 am

Here you can find an updated list of the main open source business intelligence tools. If you know of any others, don't hesitate to write to us

- Talend, including ETL, Data quality and MDM. OS and Enterprise versions

- Pentaho, including Kettle, Mondrian, JFreeReport and Weka. OS and Enterprise versions

- BIRT, for reporting

- Seal Report, for reporting

- LinceBI, including Kettle, Mondrian, STDashboard, STCard and STPivot

- Jasper Reports, including iReport. OS and Enterprise versions

- Jedox Base, including Palo core. OS and Enterprise versions

- Saiku, for OLAP Analysis. OS and Enterprise versions

- SpagoBI, including Talend, Mondrian, JPivot and Palo

- Knime, including Knime connectors

- Kibana, for elasticsearch data

↧

OLAP for Big Data. Is it possible?

November 10, 2016, 8:03 am

Hadoop is a great platform for storing a lot of data, but running OLAP is usually done on smaller datasets in legacy and traditional proprietary platforms. OLAP workloads are beginning to migrate to the one data lake that is running Hadoop and Spark.

Fortunately, there are a number of Apache projects that are starting to make OLAP possible on Hadoop.

Apache Kylin

For an introduction to this interesting Hadoop project, check out this article. Apache Kylin, originally from eBay, is a Distributed Analytics Engine that provides SQL and OLAP access to Hadoop datasets utilizing Hive and HBase. It can also be called through SparkSQL, which makes it a very useful project. This project lets you work with Power BI, Tableau and Excel, with more tool support coming soon. You can build MOLAP cubes and support many users with fast queries over billions of rows. Apache Kylin provides JDBC and ODBC drivers.

Check our post with an online demo and detailed information

An interesting talk on Mondrian, MDX and Apache Kylin points to big things in OLAP. Yet another project using the excellent Apache Calcite.

I would recommend giving this project a try to see if it meets your needs. It is one of the best options out there. It is currently not part of the Big Hadoop Three's supported stacks.

Druid

Druid is another very strong offering in fast SQL OLAP solutions on Hadoop with support growing rapidly. The documentation for this project is excellent and makes it easy for OLAP-oriented DBAs, data architects, data engineers and data focused programmers to get started with this interesting Big Data project. Druid provides sub-second OLAP Queries with column orientation and inverted indexes enabling multi-dimensional filtering and scanning to allow for aggregating and filtering data. Again, not officially part of the Big Hadoop Three's supported stacks. I recommend downloading and installing this project and giving it a test run. Airbnb and Alibaba are users of Druid.
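The combination of column orientation and inverted indexes can be illustrated with a toy sketch: each (dimension, value) pair maps to the set of row ids that contain it, filters are answered by intersecting those sets, and only the needed measure column is read. This is a conceptual illustration with made-up data, not Druid's actual storage format (which, for instance, uses compressed bitmaps rather than Python sets).

```python
from collections import defaultdict


def build_inverted_index(rows, dimensions):
    """Map each (dimension, value) pair to the set of row ids holding it."""
    index = defaultdict(set)
    for row_id, row in enumerate(rows):
        for dim in dimensions:
            index[(dim, row[dim])].add(row_id)
    return index


def filtered_sum(columns, index, filters, measure):
    """Intersect the row-id sets for all filters, then sum one measure column."""
    matching = None
    for dim, value in filters.items():
        ids = index.get((dim, value), set())
        matching = ids if matching is None else matching & ids
    if matching is None:  # no filters: scan the whole column
        matching = range(len(columns[measure]))
    return sum(columns[measure][i] for i in matching)


rows = [
    {"country": "ES", "device": "mobile", "clicks": 3},
    {"country": "ES", "device": "desktop", "clicks": 5},
    {"country": "US", "device": "mobile", "clicks": 7},
    {"country": "ES", "device": "mobile", "clicks": 2},
]
# Column-oriented layout: one list per column instead of one dict per row.
columns = {name: [row[name] for row in rows] for name in rows[0]}
index = build_inverted_index(rows, ("country", "device"))

es_mobile_clicks = filtered_sum(
    columns, index, {"country": "ES", "device": "mobile"}, "clicks"
)
```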

And the secret word for Druid: Apache Calcite. This project seems to be everywhere and you will find it here as well.

Apache Lens

Apache Lens provides a unified analytics interface to Hadoop. It is pretty quick to install, works with Hive, JDBC and OLAP Cubes. There is an Apache Zeppelin interface for Apache Lens which is good. I don't hear a lot about this one, but again it seems interesting.

Other Options To Investigate:

SnappyData (Strong SQL, In-Memory Speed, and GemfireXD history)
Apache HAWQ (Strong SQL support and Greenplum history)
Splice Machine (Now Open Source)
Hive LLAP is moving into OLAP; SQL:2011 support is growing and so is performance.
Apache Phoenix may be able to do basic OLAP with some help from Saiku or STPivot. I really like Phoenix and it has the performance and power to back up a lot of data through queries and concurrency. It is lacking a lot of the OLAP specific queries that some tools and users will most likely need. I am thinking that Apache Calcite and Phoenix will eventually make this a great OLAP tool.

Source: DZone

↧

Pentaho 7 CE now ready to download

November 12, 2016, 1:07 am

Version 7 of Pentaho Open Source is now available, both the BI Server and PDI (Pentaho Data Integration)

Enjoy!!

If you need help migrating from earlier versions, take a look at this post

On this blog you can follow what was covered in each of the very interesting talks given at the Pentaho Community Meeting in Antwerp (PCM16)

One of the most interesting features presented is:

WebSpoon

A web browser based version of Spoon. WebSpoon is basically Spoon running in your browser, easy as that.

By accessing a server URL, you can create, preview, save and run transformations and jobs in your browser. WebSpoon works on the server side, so all your transformations are stored and run on the server.

If you want to deploy WebSpoon yourself, you can download the .war file from the repository, copy it to the Tomcat webapps folder and restart your server. After doing so, WebSpoon will be accessible through the URL of your running server.

Different use cases can be thought of for a browser-based Spoon:

PDI on the go: run PDI on your smartphone or tablet.
Security: transformations and jobs run on the server so the data remains within the server.
No installation required.
No difference in UI between BI server and DI server.

To start developing and contribute to the project yourself, clone the repository, install RAP and Eclipse, and import the cloned UI folder as an Eclipse project.

↧

How to start learning Big Data in 2 hours

November 17, 2016, 6:45 am

Big Data is one of the milestones of recent years. Many people want to approach it and learn the basics first, to get a general idea. But it is hard to find a quick guide that, in a couple of hours, lets us hold our own in Big Data, especially without strong technical skills

So we have compiled a set of infographics, presentations, webinars, demos and documentation to give you a first overview of Big Data in 2 hours!!

1. Infographics

2. Webinar

View in presentation format

3. Demos

See online demos

4. Key points and presentations

5. Big Data Green Paper (Libro Verde del Big Data)

More info? Write to us

↧

Dashboards and Business Intelligence for Smart Cities

November 17, 2016, 8:17 am

More and more cities are implementing Smart City solutions, which cover a wide range of aspects in terms of technologies, devices, data analytics, etc.

What they all have in common is that they must integrate diverse information and indicators from all kinds of data sources: traditional relational databases, social networks, mobile applications, sensors... It is essential that there are no silos or closed technologies, which is why Open Source is key: it can be adapted to all kinds of solutions

Based on our experience in some of these smart city projects, we want to share a few technologies, resources and demos that may help you:

1. List of Open Source solutions for Smart Cities - Internet of Things projects

2. List of Open Source Business Intelligence tool for Smart Cities

3. 35 Open Source Tools for the Internet of Things (IoT)

Demos:

Big Data technologies

Business Intelligence demos

Near real time traffic monitoring for Madrid City Council (Access)

Geopositioning of dynamic routes (Access/Video)

Route recommendation (graphs) (Access/Video)

↧

Types of roles in Analytics (Business Intelligence, Big Data)

November 20, 2016, 9:01 am

As the Analytics industry grows, it gets harder to know the description of each role and position. What's more, the terms are often misused, mixing up tasks, job descriptions, etc.

This confuses both the specialists themselves and the people who are training and studying to do these jobs. In such a fast-changing industry, new and specialized job positions appear frequently. Here we detail each of them:

Business Analyst:

Data Analyst:

Data and Analytics Manager:

Data Architect:

Data Engineer:

Data Scientist:

Database Administrator:

Statistician:

You may also be interested in:

How to pass an interview on Pentaho BI Open Source?
Data Analyst skills and their differences
Start learning Big Data in 2 hours?

Seen on KDnuggets

↧

Business Intelligence for Hadoop Benchmark

November 25, 2016, 10:45 am

This benchmark from AtScale, available for download, is quite interesting: it offers insights about Business Intelligence on Hadoop

If you are interested, check also our posts:

- OLAP for Big Data. Is it possible?
- List of Open Source Business Intelligence tools
- Big Data OLAP Analysis on Hadoop with Apache Kylin (in Spanish)
- Real-time Apache Kafka use case, Big Data (in Spanish)

About the Benchmark:

Key Findings:

SQL-on-Hadoop engines are well suited for Business Intelligence (BI): All tested engines – Hive, Impala, Presto, and Spark SQL – successfully executed all of the queries in our benchmark suite and are stable enough to support business intelligence workloads.

There is no single “best engine”: We continue to see the different engines shine in different areas. Depending on raw data size, query complexity, and the target number of end-users enterprises will find that each engine has its own ‘sweet spot’.

Version-to-version improvements are significant: The open source community continues to drive significant and rapid improvements across the board. All engines tested showed performance gains of between 2x and 4x in the six months between the first and second edition of the benchmarks. This is great news for those enterprises deploying BI workloads to Hadoop.

Small vs. Big Data: Impala and Spark SQL continue to shine for small data queries (queries against the AtScale Adaptive Cache). New in this edition, the latest release of Hive LLAP (Live Long and Process) shows suitable “small data” query response times. Presto also shows promise on small, interactive queries.

Few vs. Many Users: While Impala continues to shine in terms of concurrent query performance, Hive and SparkSQL showed improvements in this category. Presto, new to this edition of the benchmarks, showed the best results in our user concurrency testing.

↧

Jedox 7 Launch and What's New

December 1, 2016, 11:20 am

Version 7 of one of the best solutions for Financial and Sales Planning and Budgeting has just been released: Jedox 7

Sign up for the free webinar in Spanish on December 13, from 15:30 to 17:30

Below we cover the new features, improvements and more. At this link you can find other posts we have published about Jedox

Official press release on the launch

Jedox 7:

- English-language page with what's new in Jedox 7

Jedox 7 is a true game-changer: Download our free "What's New" whitepaper and get all the details on smart modeling tools that bring your planning quickly up to speed, new design capabilities, enhancements to our innovative GPU technology, and so much more.

Download Whitepaper

Jedox Models: Planning Made Simple

We are proud to introduce four all-new Jedox Models for Profit & Loss, Cost Center, Sales and Human Resources.

In 2017, Jedox and their partners will continue to provide a growing portfolio of these predefined and configurable planning applications through the new Jedox Marketplace.

Discover how you can kickstart and improve your planning processes with our new Jedox Models.

Jedox Models

↧

7 Practical Examples and Applications of Big Data

December 6, 2016, 8:59 am

In the following applications, dashboards and examples you can see Big Data working in practice in different cases and with different technologies: Kafka, Spark, Apache Kylin, Neo4J...

Go to the examples

If you want to know more about Big Data, these links may interest you:

- OLAP for Big Data. Is it possible?
- How to start learning Big Data in 2 hours
- List of Open Source Business Intelligence tools
- Big Data OLAP Analysis on Hadoop with Apache Kylin (in Spanish)
- Real-time Apache Kafka use case, Big Data (in Spanish)

↧

Available new Open Source OLAP viewer, STPivot4

December 7, 2016, 9:38 am

STPivot4 is based on the old Pivot4J project where functionality has been added, improved and extended. These technical features are mentioned below.

GitHub STPivot4

For additional information, you may visit STPivot4 Project page at http://bit.ly/2gdy09H

Main Features:

STPivot4 is a Pentaho plugin for visualizing OLAP cubes.
Deploys as Pentaho Plugin
Supports Mondrian 4!
Improves Pentaho user experience.
Intuitive UI with Drag and Drop for Measures, Dimensions and Filters
Adds key features to Pentaho OLAP viewer replacing JPivot.
Easy multi-level member selection.
Advanced and function based member selection (Limit, Ranking, Filter, Order).
Lets users create "on the fly" formulas and calculations using
Non-MDX grand totals (min, max, avg and sum) per member, hierarchy or axis.
New user friendly Selector Area
and more…

↧

Pentaho's predictions for 2017

December 14, 2016, 9:49 am

See an online demo of Pentaho CE

Self-service data prep will unlock big data’s full value. Organizations building advanced, big data deployments like the ones needed to accurately predict election outcomes are buckling under huge, diverse data volumes. The amount of time spent simply preparing data is overwhelming organizations struggling for resources and time. This is often to the tune of anywhere between 50-70% of IT time spent preparing data. That sentiment data I mentioned only exacerbates this problem needing to be continually ingested from a huge universe of social network feeds and prepared for analysis. Self-service visualization tools that can only analyze data after it’s been prepared are diminishing in value. Our customer Sears Holdings does spot checks and visualizes its data throughout its lifecycle, which enables it to make more valuable data-driven decisions - in time for them to matter - while reducing costs. Expect more software vendors in 2017 to follow our lead and start offering tools that bridge the gap between analytics and data prep with an integrated experience for both.

Organizations are replacing self-service reporting with embedded analytics. As I first predicted in 2015, embedded analytics would become ‘the new BI’. We are now really starting to see our vision of ‘next generation applications’ mature and replace self-service reporting. Organizations can see that analytics are an expectation and must be embedded at the point of impact regardless of the end user’s sophistication. In our customer CERN’s case, this involves 15,000 users in various operational roles accessing Pentaho analytics from their normal line-of-business applications.

IoT’s adoption and convergence with big data will make automated data onboarding a requirement. This year predictive maintenance became a marquee use case for IoT’s ROI potential and this will continue to gather speed in 2017. Everything from shipping containers to oil-drilling screws to train doors is being fitted with sensors to track things like location, operating status and power consumption. And speaking of trains, expect to hear more about our project with Hitachi Rail to build ‘self-diagnosing’ trains that can detect if a problem is brewing, so that the train can either be taken out of service or repaired before the failure takes place. In order to ingest, blend and analyze the massive volumes of data all these sensors generate, more businesses will need to be able to automatically detect and onboard any data type into their analytics pipeline. This is simply way too big, complex, fast-moving and mind-numbing a job for overburdened IT teams to handle manually.

2017’s early adopters of AI and machine learning in analytics will gain a huge first-mover advantage in the digitalization of business. Big data and IoT use cases in business and industry are approaching the data variety, volume and velocity levels of large-scale scientific models for which AI and machine learning were originally conceived. Early adopters gain a jump start on the market in 2017 because they know that the sooner these systems begin learning about the contexts in which they operate, the sooner they will get to work mining data to make increasingly accurate predictions. This is just as true for the online retailer wanting to offer better recommendations to customers, a self-driving car manufacturer or an airport seeking to prevent the next terrorist attack.

Cybersecurity will be the most prominent big data use case. As with election polls, detecting cybersecurity breaches depends on understanding complexities of human behavior. Accurate predictions depend upon blending structured data with sentiment analysis, location and other data. BT’s Assure Cyber service, for example, uses Pentaho to help detect and mitigate complex and sustained security threats by blending event data and telemetry from business systems, traditional security controls, advanced detection tools among others.

↧

Location Intelligence for Indoor Maps

December 14, 2016, 10:03 am

Carto, the geospatial visualization tool we are partners of (you can see it working alongside Pentaho in this 'near real time' traffic analysis application), is launching a very interesting feature:

Business Intelligence analysis for indoor locations (Location Intelligence), that is, large offices, shopping centers, universities, public or sports buildings, etc. The possibilities are enormous.

Our colleagues at Carto tell us:

"Indoor maps often direct users to emergency exits, which has limited our context of mapping to external geographical spaces. With the rise of Indoor Positioning Systems (IPS), however, the field of data visualization is turning inward to pioneer new paths to purchase with indoor maps.

Situm, a member of Telefónica’s Open Future initiative, and known as the “GPS for indoor” start-up, analyzes indoor traffic for various sectors using location intelligence. Despite an exponential rise in mobile purchasing, the Department of Commerce reports that 90 percent of retail purchases are transacted offline, which means managing in-store traffic is crucial to maintaining a competitive edge. But aside from providing directions for customers, what, exactly, can IPS offer? Well, as we learned during a recent collaboration with Situm, the answer is a lot"

↧

A quick review of STPivot4 Open Source OLAP tool

December 14, 2016, 10:25 am

STPivot4 Open Source OLAP tool

STPivot4 is based on the old JPivot and Pivot4J projects, which are no longer under active development. We have added, improved and strengthened many functionalities, listed below as technical features.

STPivot4 includes an innovative workspace for building queries that lets end users work easily with drag and drop. End users can quickly identify the dimensions, measures and filters they want to work with. You can now search, filter, rank and select to refine a query before running it, which avoids waiting for long query response times.

Design, usability and charts have been improved; in short, the tool is easier for end users to use and manage.

STPivot4 supports Mondrian 4, which brings scalability, compliance and performance improvements, and, as a Pentaho plugin, it works with the latest available Pentaho versions.

Main Features and Download

You can download the open source code from GitHub. We will be glad to help with your Business Intelligence projects using Open Source tools if you need support, development or consultancy. We'd like to receive your feedback: info@stratebi.com

Cube Selector
We've created a new popup window where end users can easily select dimension values, measures, levels... for their queries. It includes a new search feature that improves value selection in high-cardinality dimensions.
In the design window, end users can drag and drop their dimensions, filters and measures quickly and easily.

New search functionality
One of the best new features of STPivot is the ability to search dimension values easily when you manage a large number of values.
This is very helpful when you need to identify the values you want at each level/dimension/hierarchy in order to include them in your query result.

Drag and Drop query design and build
If you have ever wanted to build your queries easily and quickly, this visual drag and drop design now makes it possible.

Filter and drill to detail
One of the best functionalities of any OLAP viewer is the ability to drill through any dimension and measure to get powerful insights into your data models.

Advanced Filters
Advanced filters are included within the Selector, so you can leverage all the power of OLAP cubes, refining your queries and nesting filters.

Ranking Top Count
Ranking Bottom Count
Order
Visual Totals
Filter
Limit First/Last
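
As a rough illustration of what the filter options listed above do to a set of members, here is a minimal Python sketch. It is not STPivot code: the function names, the `SAMPLE` data and the (member, value) row shape are assumptions made for this example, loosely modelled on the MDX set functions of similar names (TopCount, BottomCount, Order, Filter, Head, Tail, VisualTotals).

```python
from typing import Callable, List, Tuple

# Hypothetical row shape: (member name, measure value).
Row = Tuple[str, float]

# Made-up sample data, for illustration only.
SAMPLE: List[Row] = [("A", 30.0), ("B", 10.0), ("C", 50.0), ("D", 20.0)]


def order(rows: List[Row], descending: bool = False) -> List[Row]:
    """Order: sort members by measure value."""
    return sorted(rows, key=lambda r: r[1], reverse=descending)


def top_count(rows: List[Row], n: int) -> List[Row]:
    """Ranking Top Count: the n members with the highest values."""
    return order(rows, descending=True)[:n]


def bottom_count(rows: List[Row], n: int) -> List[Row]:
    """Ranking Bottom Count: the n members with the lowest values."""
    return order(rows)[:n]


def filter_rows(rows: List[Row], predicate: Callable[[Row], bool]) -> List[Row]:
    """Filter: keep only the members that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def limit_first(rows: List[Row], n: int) -> List[Row]:
    """Limit First: the first n members in their current order."""
    return rows[:n]


def limit_last(rows: List[Row], n: int) -> List[Row]:
    """Limit Last: the last n members in their current order."""
    return rows[-n:] if n > 0 else []


def visual_total(rows: List[Row]) -> float:
    """Visual Totals: total computed over the visible members only."""
    return sum(value for _, value in rows)


# Nesting filters: keep members above 15, then take the top 2 of those.
nested = top_count(filter_rows(SAMPLE, lambda r: r[1] > 15), 2)
```

In this sketch, `visual_total(nested)` adds up only the members that survive the nested filters, which is what distinguishes a visual total from a total over the whole of `SAMPLE`.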

Graphics and Visualization
STPivot includes a wide variety of chart types (pie, heatmap, line, bar...), fully configurable and with popup information, for any of your analytical needs.

Calculator
The calculator offers simplicity and power to end users: they can create their own formulas directly in a friendly interface and include them in their OLAP views.

Roadmap

We are working on new functionalities for STPivot. Some of them are listed below:

Creation of complex formulas
Creation of calculated members for use in queries
Analysis Wizard
What If
Undo Feature
Improvements in usability and design, fixes for known issues, etc.
New 'cool' ideas...

↧

Google open sources Embedding Projector for high-dimensional data

December 19, 2016, 5:45 am

≫ Next: iD v2 is now available on OpenStreetMap

≪ Previous: A quick review of STPivot4 Open Source OLAP tool

Good news for open source data visualization fans: Google open sources Embedding Projector for high-dimensional data

The tool will help machine learning researchers to visualize data without having to install and run TensorFlow.

Dimensionality, and vectors in general, is not something that most of us find easy to understand.

The problem is that we all live in a three-dimensional world. We are taught length, width and height, so we struggle to imagine what a fourth, fifth or sixth dimension might look like. This is why most of us found Christopher Nolan’s representation of additional dimensions wonky in the movie Interstellar.

To enable a more intuitive exploration process, they are open-sourcing the Embedding Projector, a web application for interactive visualization and analysis of high-dimensional data, recently shown as an A.I. Experiment, as part of TensorFlow.

They are also releasing a standalone version at projector.tensorflow.org, where users can visualize their high-dimensional data without the need to install and run TensorFlow.
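
For readers who want to try the standalone version with their own data, the sketch below writes vectors and labels as tab-separated text. It assumes the Projector's load dialog accepts one TSV file of vectors (one vector per line) and an optional TSV file of metadata (one label per line, with no header row when there is a single column); check the dialog itself for the exact format. The function name `to_projector_tsv` and the sample values are hypothetical.

```python
from typing import List, Sequence, Tuple


def to_projector_tsv(
    vectors: Sequence[Sequence[float]], labels: Sequence[str]
) -> Tuple[str, str]:
    """Return (vectors_tsv, metadata_tsv) as strings.

    Assumed format: one tab-separated vector per line, and one label
    per line in the same order, without a header row.
    """
    if len(vectors) != len(labels):
        raise ValueError("need exactly one label per vector")
    if vectors and any(len(v) != len(vectors[0]) for v in vectors):
        raise ValueError("all vectors must have the same dimension")

    vector_lines: List[str] = ["\t".join(str(x) for x in v) for v in vectors]
    # Tabs and newlines inside a label would break the one-label-per-line layout.
    label_lines: List[str] = [
        label.replace("\t", " ").replace("\n", " ") for label in labels
    ]

    vectors_tsv = "".join(line + "\n" for line in vector_lines)
    metadata_tsv = "".join(line + "\n" for line in label_lines)
    return vectors_tsv, metadata_tsv


# Made-up 3-dimensional vectors, for illustration only.
vec_text, meta_text = to_projector_tsv(
    [[0.1, 0.2, 0.3], [1.0, 2.0, 3.0]], ["cat", "dog"]
)
```

The two returned strings can be saved to files (for example `vectors.tsv` and `metadata.tsv`, names chosen here for illustration) and uploaded through the web interface.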

↧

iD v2 is now available on OpenStreetMap

December 19, 2016, 11:44 am

≫ Next: Santander y BBVA trasladan su competencia al Business Intelligence

≪ Previous: Google open sources Embedding Projector for high-dimensional data

The web-based iD editor is designed to help create an even better, more current OpenStreetMap by lowering the threshold of entry to mapping with a straightforward, in-browser editing experience.

Head over to OpenStreetMap and start editing today! You can make meaningful contributions with just a few minutes of training.

You can also help OpenStreetMap by donating to the OpenStreetMap Foundation’s 2016 funding drive. Donate today and your gift will go even further because Mapbox is matching €10,000 of donations.

Check out iD on GitHub to contribute code, make suggestions, or report an issue.

↧

Santander and BBVA take their competition to Business Intelligence (Santander y BBVA trasladan su competencia al Business Intelligence)

December 20, 2016, 7:35 am

≫ Next: New Search and Tags functionalities in Pentaho Console

≪ Previous: iD v2 is now available on OpenStreetMap

Both Banco Santander and BBVA are taking their rivalry into Business Intelligence. We say Business Intelligence rather than Big Data, which is how they tend to promote it, because for now both applications have more of the former than of the latter. They will probably use more of the latter over time.

The question is: will it really succeed among merchants? Are they prepared and trained to use Business Intelligence tools?

Here are the details:

Santander's is called Mi Comercio

Mi Comercio has three basic features:

‘Mi Facturación’ (My Billing) collects the settlement totals run on the POS terminals over the last 15 days, including the detail of those transactions.
‘Mis Clientes’ (My Customers) compiles monthly aggregated data on the new and returning customers who have bought at the business and at nearby competitors. With this information, companies and self-employed professionals can make business decisions based on details such as the time of day when their customers buy most, whether they are winning more customers than their competitors, which other business sectors their customers usually shop in, and so on.
‘Ayuda y Soporte’ (Help and Support) answers customers' most frequently asked questions and puts the support phone numbers for POS terminal users a single click away.

BBVA's is called Commerce 360

Access the purchase data from your BBVA POS terminal month by month and compare it with the commercial activity of businesses in your area and sector to make useful decisions for your business.
It gives you objective data on your customers' loyalty, their demographic segments and the main postal codes they come from.
Compare these indicators with those of your area to identify opportunities for improvement in opening hours, prices or marketing actions.
All of this at no cost for having your POS terminal with BBVA.

↧

New Search and Tags functionalities in Pentaho Console

December 21, 2016, 2:13 am

≫ Next: Conoce las novedades de Jedox 7 en este video

≪ Previous: Santander y BBVA trasladan su competencia al Business Intelligence

Hi, if you are a Pentaho user or admin managing a production environment where the number of folders, reports, analyses and dashboards grows day by day, a way to quickly find the element you want to open is very useful.

That's why we've created this component, which allows you to:

- Search by folder
- Add tags and comments for any element
- Search by any word of title, tags, and comments
- Select by any tag
- Search by date of creation or modification
- Filter by type of element: Report, OLAP or Dashboard
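
To make the search criteria listed above concrete, here is a small Python sketch of how such filtering could work over a catalog of elements. It is an illustration only, not the component's code: the `Element` class, its fields, the `CATALOG` data and the `search` function are all hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional, Set


@dataclass
class Element:
    """Hypothetical catalog entry; not a Pentaho class."""
    folder: str
    title: str
    kind: str  # "Report", "OLAP" or "Dashboard"
    modified: date
    tags: Set[str] = field(default_factory=set)
    comment: str = ""


def search(
    elements: List[Element],
    folder: Optional[str] = None,
    text: Optional[str] = None,
    tag: Optional[str] = None,
    kind: Optional[str] = None,
    modified_since: Optional[date] = None,
) -> List[Element]:
    """Return the elements matching every criterion that was given."""
    results = []
    for e in elements:
        if folder is not None and not e.folder.startswith(folder):
            continue
        if kind is not None and e.kind != kind:
            continue
        if tag is not None and tag not in e.tags:
            continue
        if modified_since is not None and e.modified < modified_since:
            continue
        if text is not None:
            # Case-insensitive substring match over title, comment and tags.
            haystack = " ".join([e.title, e.comment, *sorted(e.tags)]).lower()
            if text.lower() not in haystack:
                continue
        results.append(e)
    return results


# Made-up catalog, for illustration only.
CATALOG = [
    Element("/public/sales", "Monthly Sales", "Report", date(2016, 12, 1),
            {"sales", "monthly"}, "Revenue by region"),
    Element("/public/sales", "Sales Cube", "OLAP", date(2016, 11, 15), {"sales"}),
    Element("/public/hr", "Headcount", "Dashboard", date(2016, 10, 3),
            {"hr"}, "Staff overview"),
]
```

For example, `search(CATALOG, tag="sales", kind="OLAP")` combines two of the criteria and keeps only the elements that satisfy both.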

You can see it in action in this Online Demo

Select by Date of creation or modification

Select by type of element, tag, date and text search

Add tags and description

↧
