Datafeed Design / Considerations (SAFEQ Cloud)

Module 2 of Intuitive Solution Expertise

Now that we know how data gets through the API and into a series of tables, we can start to make use of those tables in our datafeeds.

Let’s edit one of the feeds. In this case, we’ll start with ‘Document History’. Opening the SQL, we’ll scroll right down to the bottom so we can see what we’re selecting, and in what quantity.

We do use each of the tables, as they each have a specific function. The bread and butter of print data is held within the Document_History table. This contains most of the data used to populate the SAFEQ Cloud dashboards. 

There are a few LEFT JOINs that attach the other supplementary tables to the job-level details.

We start with the join to document_history_department, a table we create to log the department and tag recorded against that user / print job at the time.

When a user is created, they can be assigned to a single department, but also to multiple tags. Each tag becomes a new column in the table, meaning that if one user assigns themselves 100 tags, the document_history table gains 100 extra columns for everyone, even if no other user uses those tags.

This creates a problem for us, both in terms of table size and in terms of the readability and usability of the data in this format.

We split these tags out into single rows to build up the document_history_department table, and then remove them from the document_history table to keep it tidy.
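As a rough illustration, that reshaping looks something like this (the table and column names here are assumptions for the sketch, not the exact names used in the solution):

    -- Hypothetical unpivot: one row per job/tag pair instead of one column per tag.
    -- The tag columns (tag_finance, tag_marketing) are illustrative only.
    INSERT INTO document_history_department (document_id, user_id, department, tag)
    SELECT id, user_id, department, 'Finance'   FROM document_history WHERE tag_finance = 1
    UNION ALL
    SELECT id, user_id, department, 'Marketing' FROM document_history WHERE tag_marketing = 1;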

The next table we join is the department table. This follows on from the document_history_department table to build up a concise list of departments and tags against a single user.

Outputs gives the list of print devices configured inside of the SAFEQ Cloud software. 

Account Containers lists the containers created in that customer’s tenant.

Locations lists the configured Locations within SAFEQ Cloud.
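Putting that together, the overall shape of the joins is roughly as follows (the aliases and key columns are assumptions for illustration; check the feed’s actual SQL for the real names):

    SELECT dh.*
    FROM document_history dh
    LEFT JOIN document_history_department dhd ON dhd.document_id = dh.id
    LEFT JOIN department dep ON dep.id = dhd.department_id
    LEFT JOIN outputs o ON o.id = dh.output_id
    LEFT JOIN account_containers ac ON ac.id = dh.account_container_id
    LEFT JOIN locations loc ON loc.id = o.location_id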

Before we move onto the content of the SQL, it’s worth pausing here to explain the WHERE clauses. We ordinarily retrieve a rolling two years from the SQLite tables, based on the dateTime attribute from the Document_History table. The basic logic of this clause is that if a job’s date falls between ‘now’ and two years ago, we report on it. Anything outside that range is therefore not loaded into the datafeed.

Additionally, we filter on an attribute called ‘status’, which is currently a number between 0 and 9. It indicates the current status of that job: whether it’s been printed, deleted, expired, failed and so on.

We only show jobs with a status of 1, meaning they have been printed and are not showing as any other status within SAFEQ Cloud.

And finally, we also group and order the data by various IDs. This is largely unchanged from how the data is presented to us in the table, but because we perform sums on the data, we still need to group by the ID so that we show the individual job-level detail.
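As a sketch, the tail of the query looks something like this (column names are assumptions; it also assumes dateTime is stored in a format SQLite’s date() function can parse, so adjust if yours is a Unix epoch):

    WHERE date(dh.dateTime) BETWEEN date('now', '-2 years') AND date('now')
      AND dh.status = 1
    GROUP BY dh.id
    ORDER BY dh.id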

Let’s move onto the main content of the SQL. The first section comprises the IDs we select, followed by the time-based aggregation.

The time section requires some explanation, but the main function of this part of the SQL is to calculate the number of months elapsed since that job took place, compared to now. 

As the calculation is quite long, if you'd like the "month_text" calculation for reference, please refer to your own SAFEQ Cloud implementation, or alternatively, reach out to support@weareintuitive.com to ask for a copy.

We then work out the month the job took place in, give it a three-letter abbreviation, and put the elapsed months and month text together.

Then we include the rest of the date fields, such as the raw timestamp, the D-M-Y of the timestamp and the Y-M of the timestamp. 
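While the production month_text calculation is longer, a minimal sketch of the general idea in SQLite might look like this (an illustration only, not the calculation from our feed):

    -- Whole months elapsed between the job date and now (illustrative)
    (strftime('%Y', 'now') - strftime('%Y', dh.dateTime)) * 12
      + (strftime('%m', 'now') - strftime('%m', dh.dateTime)) AS months_elapsed,

    -- Three-letter abbreviation of the job's month
    substr('JanFebMarAprMayJunJulAugSepOctNovDec',
           (CAST(strftime('%m', dh.dateTime) AS INTEGER) - 1) * 3 + 1, 3) AS month_text,

    -- Raw timestamp plus the common reformats mentioned above
    dh.dateTime AS raw_timestamp,
    strftime('%d-%m-%Y', dh.dateTime) AS date_dmy,
    strftime('%Y-%m', dh.dateTime) AS date_ym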

We also generate some Boolean-style flags for the dates we receive. These use similar logic to the WHERE clause at the bottom of the SQL for selecting dates within a certain range (see the sketch after this list). This includes, but is not limited to:

  • Dates in the last year or two 
  • Dates in the last day or two 
  • Dates in the last week or two 
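A minimal sketch of a couple of these flags, using the same assumed dateTime column as before:

    CASE WHEN date(dh.dateTime) >= date('now', '-1 year')
         THEN 1 ELSE 0 END AS in_last_year,
    CASE WHEN date(dh.dateTime) >= date('now', '-7 days')
         THEN 1 ELSE 0 END AS in_last_week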

Next are the user and account details, followed by the document details. 

We form our own Extensions values based either on the job type (to categorise scans, faxes, etc.) or on the name of the document itself.
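As a hedged sketch of that idea (the job type code and column names are placeholders, not the real values from SAFEQ Cloud):

    CASE
      WHEN dh.jobType = 2               THEN 'Scan'   -- hypothetical scan job code
      WHEN lower(dh.name) LIKE '%.pdf'  THEN 'pdf'
      WHEN lower(dh.name) LIKE '%.doc%' THEN 'doc'
      ELSE 'other'
    END AS extension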

We then display the device details (based on our outputs table) including the name, location and serial number. 

Next up are the volumes, which are various sums based on the raw volume and the colour, mono and duplex pages.
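In outline, those sums look something like this (column names assumed):

    SUM(dh.totalPages)                                         AS total_pages,
    SUM(CASE WHEN dh.color = 1 THEN dh.totalPages ELSE 0 END)  AS colour_pages,
    SUM(CASE WHEN dh.color = 0 THEN dh.totalPages ELSE 0 END)  AS mono_pages,
    SUM(CASE WHEN dh.duplex = 1 THEN dh.totalPages ELSE 0 END) AS duplex_pages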

Costs are a unique Intuitive attribute. They don’t exist in SAFEQ Cloud, but we create a supplementary table of global click cost metrics for the colour, mono and scan pages.  

We won’t cover the cost metrics in this module, but they’re used in this SQL logic to multiply by the pages of the job.  

You’ll notice that the structure of the costs has quite a few WHERE clauses. This is due to the complexity of the cost table, as customers can opt in to A3 page doubling and scan pricing. This means there are a few use cases we need to account for, which are expressed by conditions satisfying a ‘Y’ for ‘Yes’ or an ‘N’ for ‘No’ in those cases.

Costs follow similar logic to the volumes, in that we calculate either the total sum of cost or the individual mono and colour costs.
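A much-simplified sketch of that multiplication, assuming a hypothetical click cost table (cc) and a ‘Y’/‘N’ A3 doubling flag (none of these names are the real ones):

    SUM(
      dh.totalPages
      * CASE WHEN cc.a3_doubling = 'Y' AND dh.paperSize = 'A3' THEN 2 ELSE 1 END
      * CASE WHEN dh.color = 1 THEN cc.colour_cost ELSE cc.mono_cost END
    ) AS total_cost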

Next, we have the figures for our Supplementary Document Information. This includes the flags for duplex documents and the different ways of forming that data, either as the Boolean flag or as its text equivalent. There are also flags for whether the document was in colour or mono, and other document attributes, such as the 'paperSize' and whether it was stored online.
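For example, the flag and its text equivalent might be formed like this (a sketch with assumed names):

    dh.duplex AS duplex_flag,
    CASE WHEN dh.duplex = 1 THEN 'Duplex' ELSE 'Simplex' END AS duplex_text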

After that, you can also see the different statuses that SAFEQ Cloud gives us, with a key of what each of those status codes means. It’s worth mentioning at this point that this is a bit of a legacy piece of information: if you recall from the start of this module, we only show documents with a status of ‘1’, so we’re only going to see documents that have been ‘Printed’ or really, they’ve been ‘completed’.

In most of our solutions, we include what we call a ‘NULL Parameter’. This is a flexible attribute for our datasets to use inside of components. It’s just a blank field that we can redefine based on whatever we need to show for a given purpose. For example, the datafeed may expose the NULL Parameter, and where it’s blank (i.e. on every row), a component can substitute filler text such as ‘Show All’ or ‘Show All Users’.
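In the SQL itself this is as simple as selecting a literal NULL; the ‘Show All’ substitution happens downstream in the dataset or component (the alias here is illustrative):

    NULL AS null_parameter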

Whilst that’s a lot to cover for a single feed, the breadth of the solution’s data aggregation is performed here, so it’s worth getting to grips with what this feed is doing, and especially how you might change it based on customer requirements. Think about circumstances where customers may want to show data from the last 3 days, 8 months, or 4 years.

How would you introduce those flags? Well, you can use ours as a basis, and add new lines or modify existing ones to suit your purposes. 
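For instance, a ‘last 3 days’ or ‘last 8 months’ flag could follow the same pattern as the existing ones (a sketch, not a copy of our SQL):

    CASE WHEN date(dh.dateTime) >= date('now', '-3 days')
         THEN 1 ELSE 0 END AS in_last_3_days,
    CASE WHEN date(dh.dateTime) >= date('now', '-8 months')
         THEN 1 ELSE 0 END AS in_last_8_months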

Let’s open the rest of our feeds, starting with ‘Filter Print Log Attributes’. You’ll notice that the rest of the feeds have a ‘Filter’ prefix. This is because we use them as dedicated aggregation points to improve on filtering for our dashboards. 

The filter components use these feeds to speed up load times, as the software behind the dashboards needs to work out the unique attributes for particular fields. Say we design a filter to view the Departments in the data: if we used Document History, the software would need to scan through potentially hundreds of thousands of rows to find potentially fewer than a hundred unique department names.

Therefore, if we have a dedicated filter datafeed, we can reduce the number of rows in the table so the software does not have to do as much heavy lifting when users are filtering on dashboards. 
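In its simplest form, a filter feed is just a de-duplicating select over far fewer columns (illustrative names again):

    SELECT DISTINCT dep.name AS department
    FROM document_history_department dhd
    LEFT JOIN department dep ON dep.id = dhd.department_id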

Getting back to the feed, we can open the SQL to see what’s going on. 

The main focus of this feed is to provide filter attributes relating to the actual print data. 

We select attributes such as the job type, paper size and duplex parameters. 

So if you want to add additional filter parameters that relate to the documents themselves, this is the best place to add them.

One thing to be cautious of is introducing highly unique attributes, such as IDs or timestamps. Both of those types of parameter can completely bloat the feed by adding multiple rows for every unique combination of attributes onscreen.

If we look at the feed results as they are now, we see a combination of every single parameter relating to the data we’re getting from this customer.

If we take the top line, it means there is a job (or jobs) containing that combination of values. If we introduce an ID into the mix, what we’ll end up with is every iteration of that unique combination, multiplied by the number of times it has occurred against that Job ID, Device ID, Account Container ID and so on.

If you wanted to test this out, on your own installation of Intuitive for SAFEQ Cloud, try validating this feed and note the time taken to validate, then add an ID of your choosing into this feed (don’t forget to group it!) and validate again. You may notice that the time taken to form this feed has changed, but did you find it to improve or deteriorate? 

Let’s move onto the next feed ‘Filter Printer Attributes’. 

This is much the same as the previous filter, but instead of being document focused, its area of interest is the device estate. 

You may have noticed by this point that the feeds start to share particular attributes. We’ll cover this in more detail in the dataset module, but the short explanation is that we need some similarities between them so we can accurately filter and link data together. Typically these are time- or ID-based.

Back to the ‘Filter Printer Attributes’ feed. SAFEQ Cloud gives us some detailed data but, as with an on-premise SAFEQ 6 system, we do not receive the model type / family, just the device’s serial number, ID and name.

The last feed to take a look at is ‘Filter Department Attributes’. 

As the name suggests, this concerns the Departments that users are assigned to, but also some supplementary info on the SAFEQ Cloud account containers and locations.

If you recall from the previous modules and the architecture of the solution, these feeds read from tables inside of a SQLite database. The SQL onscreen is therefore specific to SQLite. This means that any functions we use are native to SQLite, and may not have the intended effect when translated into a SQL Server equivalent.
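For example, the rolling two-year date logic would need rewriting if the feed were ever pointed at SQL Server (the T-SQL line is shown for comparison only):

    -- SQLite
    WHERE date(dh.dateTime) >= date('now', '-2 years')

    -- SQL Server equivalent (for comparison)
    -- WHERE dh.dateTime >= DATEADD(year, -2, GETDATE())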

SQLite has a wealth of functions available, documented at the vendor’s website:

https://sqlite.org/lang_corefunc.html