This chapter expands on the basic operations shown in by showing how to do much more complex searches across the data. It also demonstrates a few useful external modules available on CPAN and how to do transactions and locking.
You should understand basic DBIx::Class use as shown in Chapter
4. While the SQL used in this chapter is explained as we go, it will
be helpful if you already know how to use the keywords JOIN
, GROUP BY
, ORDER BY
and so on. We will be giving code examples and tests
using Test::More so you should be familiar with Perl unit testing. The
database we are using is provided as an SQL file you can import into
an SQLite database1 to get started.
Download the skeleton code for this chapter:
In Chapter 4 we covered simple queries to fetch single rows from the
database, change and delete the data. Now we're going to use the
database to prefilter, sort, slice and dice the data for us, as this
is more efficient than fetching the data a piece at a time and doing
the work in Perl. DBIx::Class allows you to use Perl data structures
and methods to describe the intended query, and then optimises the
result into SQL. The main method we will be using is search
, with
various conditions and attributes.
In Chapter 4 we searched for a set of users with rude or unwanted
words as their realnames, in order to remove them from the
database. This uses the search
method and filters the results using
the -like
comparison operator.
LIKE
is an SQL keyword used to compare data against simple wildcard
matching. %
matches any number of characters, _
matches a single
character. Note that while LIKE is defined as being case-insensitive
in the SQL standard, it is not implemented as such in all databases
(PostgreSQL for example, provides the LIKE and ILIKE keywords).
The two arguments to search
are a set of conditions and a set of
attributes. The conditions supply the filters to apply to the data,
the attributes add grouping, sorting and joining and more. The
conditions are turned into SQL using a module called
SQL::Abstract2
The following are some tips on creating SQL conditions using the SQL::Abstract syntax. Remember that in the DBIC_TRACE output, the '?' characters represent bind parameters, the actual values are shown after the query, and will be escaped before being used in the SQL.
-
A hashref produces a set of AND'd conditions:
Perl: search({ username => 'fredbloggs', email => '[email protected]' });
SQL: ... WHERE username = ? AND email = ? : 'fredbloggs', '[email protected]'
-
An arrayref produces a set of OR'd conditions:
Perl: search([ { username => 'fredbloggs' }, { email => '[email protected]' } ]);
SQL: ... WHERE username = ? OR email = ? : 'fredbloggs', '[email protected]'
-
We can mix and match these:
Perl: search([ { username => 'fredbloggs', realname => 'Fred Bloggs' }, { email => '[email protected]' } ]);
SQL: ... WHERE (username = ? AND realname = ?) OR email = ? : 'fredbloggs', 'Fred Bloggs', '[email protected]'
-
To OR a set of alternate values for one column, we can pass an arrayref as the hashref value:
Perl: search({ username => [ 'fredbloggs', 'alicebloggs', 'joebloggs' ] });
SQL: ... WHERE username = ? OR username = ? OR username = ? : 'fredbloggs', 'alicebloggs', 'joebloggs'
-
To produce SQL-keywords, we generally precede them with an '-' symbol, so to get IN:
Perl: search({ username => { '-in' => [ 'fredbloggs', 'alicebloggs', 'joebloggs' ] } });
SQL: ... WHERE username IN ( 'fredbloggs', 'alicebloggs', 'joebloggs' )
-
Or BETWEEN:
Perl: search({ created_date => { '-between' => ['2012-01-01', '2012-12-31'] } });
SQL: ... WHERE created_date BETWEEN ? AND ? : '2012-01-01', '2012-12-31'
-
So to recap from the previous chapter:
my @badwords = ('john', 'joe', 'joseph', 'jess', 'james'); my $badusers_rs = $users_rs->search({ realname => [ map { { 'like' => "%$_%"} } @badwords ], });
Creates a list of OR'd LIKE queries:
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users
WHERE me.realname LIKE '?'
OR me.realname LIKE '?'
OR me.realname LIKE '?'
OR me.realname LIKE '?
OR me.realname LILE '?'
: '%john%', '%joe%', '%joseph%', '%jess%', '%james%'
We can also call functions, concatenate values and so on. The rest of
this chapter will demonstrate more search conditions, and the possible
contents of the second argument to search
, for attributes.
A default search
call will fetch all the columns defined in the
ResultSource that we're using for the search. Note that the
ResultSource itself does not need to define all the columns in a
database table. If you don't need to use some of them in your
application at all, you can leave them out of the class definition.
You may want to reduce the set of columns fetched from the database,
useful if one of them is a large blob type column and you don't always
need the data. The columns
attribute replaces the default list with
the supplied arrayref of column names.
To fetch the user data without the password column:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_minus_passwd_rs = $schema->resultset('User')->search({
}, {
columns => [ qw/me.id me.realname me.username me.email/ ],
});
The SQL created by DBIx::Class will always use a table
alias3 to identify the tables in the query. The table
the search is based upon is always aliased as me
.
To better express the "all but the password column" we can fetch the list of defined columns from the ResultSource, and subtract the column:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User');
my $users_minus_passwd_rs = $users_rs->search({
}, {
columns => [ grep { $_ ne 'password' } ($users_rs->resultsource->columns) ],
});
The SQL we get for both of these is the same:
SELECT me.id, me.realname, me.username, me.email
FROM users me
Note, if this is a query you are likely to repeat, or you just want to keep your code tidier, you can store it as a predefined method on the ResultSet object. See for how to do this.
The SQL SELECT
clause can contain many other things, for example
functions such as length
. To output a function and its arguments,
use a hashref in the columns
attribute. You can also add more columns
to the default set using the +columns
attribute.
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User');
my $users_plus_emaillen_rs = $users_rs->search({
}, {
'+columns' => [ { 'emaillen' => { length => 'me.email' } }],
});
Which produces the SQL:
SELECT me.id, me.realname, me.username, me.password, me.email, length(me.email)
FROM users me
The outer level of the column hashref has the new internal column name
as its key. This gives you a way to access to resulting value. Note
that we'll need to use the get_column
method to fetch the data as it
does not create a new accessor method on the resulting Row object.
while (my $user = $users_plus_emaillen_rs->next) {
print "User: ", $user->username, " has email length ", $user->get_column('emaillen'), " \n";
}
Later on in the we'll show how to include data from related tables, and select entire sets of related data in the same single query.
This is a straight-forward practice of the techniques just
described. Write a piece of code that fetches all the posts entries,
but just the title
and created_date
columns, not the actual
post
content.
You can find the skeleton of this test in the downloadable code, in the t/no-post-content.t file.
#!/usr/bin/env perl
use strict;
use warnings;
use Test::More;
use_ok('MyBlog::Schema');
unlink 't/var/myblog.db';
my $schema = MyBlog::Schema->connect('dbi:SQLite:t/var/myblog.db');
$schema->deploy();
## insert some test data
my $users_rs = $schema->resultset('User');
$users_rs->create({
realname => 'John Smith',
username => 'johnsmith',
password => Authen::Passphrase::SaltedDigest->new(
algorithm => "SHA-1",
salt_random => 20,
passphrase => 'johnspass',
),
email => '[email protected]',
posts => [
{
title => "John's first post",
post => 'Tap, tap, is this thing on?',
created_date => DateTime->now,
},
{
title => "John's second post",
post => "Anybody out there?",
created_date => DateTime->now,
}
],
});
my $posts_no_content_rs;
## Your code goes here!
## End your code
## Tests:
is($posts_no_content_rs->count, 2, 'Found both posts');
my $first_post = $posts_no_content_rs->next();
my %post_data = $first_post->get_columns;
is(scalar keys %post_data, 2, "Got two columns in the first post");
ok($first_post->title, 'Got a title on the first post');
ok($first_post->created_date, "Got a created_date on the first post");
ok(!$first_post->post, 'No post content on first post');
done_testing;
The SQL language provides a number of ways to manipulate the data before building rows out of the results. Doing this work in the database instead of your Perl code is much more efficient.
First we look at ordering or sorting the results, and reducing the number of rows returned. We can sort results based on any data values in the database, or manipulations of those values. The type of sorting we get depends on the data types in the columns. For example, we can fetch all our users ordered by their usernames for display.
The order_by
attribute corresponds to the SQL keyword ORDER BY
.
Sorting can be done either ascending, with -asc
, or descending with
-desc
. If neither asc or desc are supplied, the database default is
usually ascending. With no ORDER BY
clause at all, the return order
of the results is undefined.
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User')->search({
},
order_by => { '-desc' => ['me.username'] }
});
We get the SQL:
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users me
ORDER BY me.username DESC
The results from our loop over the results are now sorted by username. There's no need to sort further in Perl or your templating system.
Now that we have rows in a known order, we can also reduce the set of
results to a top ten, or to a page worth of results, so that we only
fetch data we'll actually use. There are unfortunately many different
database-specific implementations of ways to reduce the number of rows
returned by an SQL query. Luckily DBIx::Class abstracts these away
for you, providing a single rows
attribute that is converted to the
correct keyword(s) when the query is run.
The first 'page' of 10 usernames to display:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User')->search({
}, {
order_by => { '-desc' => ['me.username'] },
rows => 10,
});
On SQLite and MySQL we get the SQL:
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users me
ORDER BY me.username DESC
LIMIT 10
Which returns a maximum of 10 rows. If the unlimited query would return fewer than 10 rows, then we just get those. No error is thrown.
SQL even provides a way to return a specified subset of results from
within the entire set, so we can fetch a second page of results
precisely. DBIx::Class implements this by providing the page
attribute to select which page of results to return. This defaults to
1 if not supplied.
The second 'page' of 10 usernames to display:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User')->search({
}, {
order_by => { '-desc' => ['me.username'] },
rows => 10,
page => 2,
});
On SQLite and MySQL we get the SQL:
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users me
ORDER BY me.username DESC
OFFSET 10
LIMIT 10
Another good use of sorting is to sort things by datetime values to display them in the order that they happened. Now combine search conditions with attributes, and find: 1) all the posts by the user fredbloggs in the order they were created, 2) the 2nd "page" of 2 posts per page.
This test can be found in the file t/ordered-posts.t.
#!/usr/bin/env perl
use strict;
use warnings;
use Authen::Passphrase::SaltedDigest;
use Test::More;
use_ok('MyBlog::Schema');
package Test::ResultSet;
use strict;
use warnings;
use base 'DBIx::Class::ResultSet';
__PACKAGE__->mk_group_accessors('simple' => qw/method_calls/);
sub new {
my ($self, @args) = @_;
$self->method_calls({});
$self->next::method(@args);
}
## Count how many times search is called
sub search {
my ($self, @args) = @_;
$self->method_calls->{search}++;
$self->next::method(@args);
}
package main;
unlink 't/var/myblog.db';
my $schema = MyBlog::Schema->connect('dbi:SQLite:t/var/myblog.db');
$schema->deploy();
foreach my $source ($schema->sources) {
$schema->source($source)->resultset_class('Test::ResultSet');
}
my $users_rs = $schema->resultset('User');
## Add some initial data:
my @users = $users_rs->populate([
{
realname => 'Fred Bloggs',
username => 'fred',
password => Authen::Passphrase::SaltedDigest->new(algorithm => "SHA-1", salt_random => 20, passphrase=>'mypass'),
email => '[email protected]',
posts => [
{ title => 'Post 4', post => 'Post 4 content', created_date => '2012-04-01 10:00:00' },
{ title => 'Post 3', post => 'Post 3 content', created_date => '2012-03-01 10:00:00' },
{ title => 'Post 2', post => 'Post 2 content', created_date => '2012-02-01 10:00:00' },
{ title => 'Post 1', post => 'Post 1 content', created_date => '2012-01-01 10:00:00' },
{ title => 'Post 5', post => 'Post 5 content', created_date => '2012-05-01 10:00:00' },
{ title => 'Post 6', post => 'Post 6 content', created_date => '2012-06-01 10:00:00' },
],
},
### 1) Posts by fred in order
## Your code goes here:
my $ordered_rs;
## Your code end
is($users_rs->method_calls->{search}, 1, 'Called "search" just once');
is($ordered_rs->count, 6, 'Found 6 posts');
foreach my $i (1..6) {
my $row = $ordered_rs->next;
ok($row->isa('MyBlog::Schema::Result::Post'), 'Result isa Post object');
is($row->title, "Post $i", "Post $i returned in order");
}
## 2) 2nd page of posts by fred, 2 per page
## Your code goes here:
my $ordered_page_rs;
## Your code end
is($users_rs->method_calls->{search}, 2, 'Called "search" a second time');
is($ordered_page_rs->count, 2, 'Found page-worth of posts (2)');
foreach my $i (3,4) {
my $row = $ordered_rs->next;
ok($row->isa('MyBlog::Schema::Result::Post'), 'Result isa Post object');
is($row->title, "Post $i", "Post $i returned in order");
}
done_testing;
The solution is available in the download if you need to refer to it.
We've seen how to create related rows, either singly or together with
other data. Now we can look at how to query or fetch all that data
without making multiple trips to the database. The SQL keyword JOIN
can be produced using the attribute key join
and providing it a list of
related resultsources to join on. The joined tables can be accessed in
the search conditions and other clauses, using the name of the
relationship.
On its own, the join
clause will not produce any change in the
query, as it will be optimised away by DBIx::Class. We need to make at
least some use of the joined table's columns. So to demonstrate it we
also introduce the group_by
attribute key, which produces the GROUP BY
keyword in SQL.
Grouping the results is a useful technique which allows us to apply
things like mathematical functions to the content. For example we can
group all of the posts written by each user together, and then apply
the COUNT
function. The result will contain the count of the set of
rows in each group, otherwise know as the number of posts written by
the user.
Here is the concrete example. We fetch the usernames together with the number of posts that they have written:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_post_count_rs = $schema->resultset('User')->search({
}, {
'columns' => [ 'me.username', { post_count => { count => 'posts.id' } }],
group_by => [ 'me.username' ],
join => ['posts'],
});
The columns
technique is the same one we saw earlier in
, and count
just calls the SQL
built in aggregate count function. This only works if group_by
is
used so that it knows what to count against.
We get the SQL:
SELECT me.username, count(posts.id)
FROM users me
LEFT JOIN posts posts ON me.id = posts.id
GROUP BY me.username
Let's look at some results to see what is going on:
+------------+------------+ | username | post_count | +============+============+ | fredbloggs | 2 | +------------+------------+ | joebloggs | 3 | +------------+------------+
Table: Count of posts per user
So, how does that work? If we run the same query, just fetching the usernames and post IDs, with no counting, we can see the raw data:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_posts_rs = $schema->resultset('User')->search({
}, {
'columns' => [ 'me.username', 'posts.id' ],
join => ['posts'],
});
We get the SQL:
SELECT me.username, posts.id
FROM users me
LEFT JOIN posts posts ON me.id = posts.id
The raw uncounted results:
+------------+------------+ | username | posts.id | +============+============+ | fredbloggs | 1 | +------------+------------+ | fredbloggs | 3 | +------------+------------+ | joebloggs | 2 | +------------+------------+ | joebloggs | 4 | +------------+------------+ | joebloggs | 5 | +------------+------------+
Table: Individual posts per user
So the grouping is collapsing the username
column by each set of
usernames, and the count
function adds up the number of rows per
set.
Note that if we want to fetch more than the username
column from the
users table, we need to list those columns in the group_by
column
list. This is part of the SQL standard for group by, to ensure that
the results are not ambiguous. Most databases enforce this and will
throw an error if you choose to select columns which are not in the
grouping list, nor have an aggregate function applied to them. Some,
such as MySQL, will not warn by default.
The sum
, avg
, min
, and max
aggregate functions work just like
the count
function. Use this information to create a query that will
return the earliest post (minimum created_date
) of each user in the
database.
This test can be found in the file t/earliest-posts.t.
#!/usr/bin/env perl
use strict;
use warnings;
use Authen::Passphrase::SaltedDigest;
use Test::More;
use_ok('MyBlog::Schema');
package Test::ResultSet;
use strict;
use warnings;
use base 'DBIx::Class::ResultSet';
__PACKAGE__->mk_group_accessors('simple' => qw/method_calls/);
sub new {
my ($self, @args) = @_;
$self->method_calls({});
$self->next::method(@args);
}
## Count how many times search is called
sub search {
my ($self, @args) = @_;
$self->method_calls->{search}++;
$self->next::method(@args);
}
package main;
unlink 't/var/myblog.db';
my $schema = MyBlog::Schema->connect('dbi:SQLite:t/var/myblog.db');
$schema->deploy();
foreach my $source ($schema->sources) {
$schema->source($source)->resultset_class('Test::ResultSet');
}
my $users_rs = $schema->resultset('User');
## Add some initial data:
my @users = $users_rs->populate([
{
realname => 'Fred Bloggs',
username => 'fred',
password => Authen::Passphrase::SaltedDigest->new(algorithm => "SHA-1", salt_random => 20, passphrase=>'mypass'),
email => '[email protected]',
posts => [
{ title => 'Post 1', post => 'Post 1 content', created_date => DateTime->new(year => 2012, month => 01, day => 01) },
{ title => 'Post 2', post => 'Post 2 content', created_date => DateTime->new(year => 2012, month => 01, day => 03) },
],
},
{
realname => 'Joe Bloggs',
username => 'joe',
password => Authen::Passphrase::SaltedDigest->new(algorithm => "SHA-1", salt_random => 20, passphrase=>'sillypassword'),
email => '[email protected]',
posts => [
{ title => 'Post 3', post => 'Post 3 content', created_date => DateTime->new(year => 2012, month => 01, day => 05) },
{ title => 'Post 4', post => 'Post 4 content', created_date => DateTime->new(year => 2012, month => 01, day => 07) },
],
}
]);
### Users and their earliest posts
## Your code goes here:
my $earliest_rs;
## Your code end
is($users_rs->method_calls->{search}, 1, 'Called "search" just once');
is($earliest_rs->count, 2, 'Found 2 users');
my $row = $earliest_rs->next;
ok($row->isa('MyBlog::Schema::Result::User'), 'Found user objects');
ok($row->username eq 'fred' && $row->created_date->ymd eq '2012-01-01'
|| $row->username eq 'joe' && $row->created_date->ymd eq '2012-01-05',
'Found earliest post, 1st user');
$row = $earliest_rs->next;
ok($row->username eq 'fred' && $row->created_date->ymd eq '2012-01-01'
|| $row->username eq 'joe' && $row->created_date->ymd eq '2012-01-05',
'Found earliest post, 2nd user');
done_testing;
With and the various aggregation functions we can now sum or count data across groups of rows. Suppose we now want to restrict the results of our query to only those post authors that have written at least one post. The way to do this that feels logical is to add to the WHERE clause:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_post_count_rs = $schema->resultset('User')->search({
'post_count' => { '>' => 1 },
}, {
'+columns' => [ { post_count => { count => 'posts.id' } }],
group_by => [ $users_rs->resultsource->columns ],
join => ['posts'],
});
Note that we're using +columns
again, to add the count column to the
default list from the User
source. In order to create a legal GROUP BY
clause it needs to contain all the columns in the query that are
not being aggregated, so we use the list of columns
from the
ResultSource.
SQL:
SELECT me.id, me.realname, me.username, me.password, me.email, count(posts.id)
FROM users me
LEFT JOIN posts ON me.id = posts.id
WHERE post_count > ?
GROUP BY me.id, me.realname, me.username, me.password, me.email : 1
This doesn't work.
It doesn't work for two reasons, firstly, the WHERE clause post_count > 1
doesn't make any sense as it uses the unknown column
post_clause
. To name results of aggregate functions in SQL, we need
to add an alias for it, using AS
, like this:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_post_count_rs = $schema->resultset('User')->search({
'post_count' => { '>' => 1 },
}, {
'+columns' => [ { post_count => { count => 'posts.id' },
-as => 'post_count',
} ],
group_by => [ $users_rs->resultsource->columns ],
join => ['posts'],
});
SQL:
SELECT me.id, me.realname, me.username, me.password, me.email, count(posts.id) AS post_count
FROM users me
LEFT JOIN posts ON me.id = posts.id
WHERE post_count > 1
GROUP BY me.id, me.realname, me.username, me.password, me.email : 1
The second reason is that the SQL clauses are run in order. The WHERE clause comes before the GROUP BY clause (because that's just the way SQL is defined, and the database will complain), and thus the grouped count will not yet exist when we try to filter it in the WHERE.
To achieve our goal we need to use a new SQL keyword, HAVING
. This
clause comes after the GROUP BY
, and thus can be used to filter the
grouped results:
So to actually return all users that have at least one post:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_with_as_least_one_post_rs = $schema->resultset('User')->search({
}, {
'+columns' => [ 'me.username', { post_count => { count => 'posts.title',
-as => 'post_count',
}
],
group_by => [ $users_rs->resultsource->columns ],
join => ['posts'],
having => [ { 'post_count' => { '>=' => 1 } } ],
});
We get the SQL:
SELECT me.id, me.realname, me.username, me.password, me.email, count(posts.id) AS post_count
FROM users me
LEFT JOIN posts ON me.id = posts.id
GROUP BY me.id, me.realname, me.username, me.password, me.email
HAVING post_count <= ? : 1
WARNING: If you get an error from your database here, and it is an Oracle
or MS SQL Server, then you will need to use different code. As these
databases do not parse/run the SQL in order, the post_count
alias is
still unknown while parsing the HAVING
clause. We need to repeat the
condition instead:
my $users_with_as_least_one_post_rs = $schema->resultset('User')->search({
}, {
'+columns' => [ { post_count => { count => 'posts.title' },
}
],
group_by => [ $users_rs->resultsource->columns ],
join => ['posts'],
having => \[ 'count(posts.id) >= ?', [ {} => 1 ] ],
});
The having
clause now contains a piece of literal SQL. We could
write it without the arrayrefs and the placeholder, but its always
preferable to use placeholders in SQL wherever possible, to avoid SQL
injection attacks. The syntax allows for a piece of SQL, containing
placeholders, which will be inserted literally into the clause, and
then an arrayref for each placeholder value. The placeholder values
contain a hashref which can be used to define the sql datatype, but
for now we just supply an empty one, and a plain value of 1
.
The whole thing is then taken as a reference, so we get a reference to an arrayreference, so that SQL::Abstract will know it is for literal SQL, and not some other construct. It's a bit ugly, but also quite exact.
So for Oracle and MS SQL Server we get the SQL:
SELECT me.id, me.realname, me.username, me.password, me.email, count(posts.id)
FROM users me
LEFT JOIN posts ON me.id = posts.id
GROUP BY me.id, me.realname, me.username, me.password, me.email
HAVING count(posts.id) >= ? : 1
Joins can also be used to fetch multiple tables worth of data in the same query, eliminating extra trips to the database. To illustrate, if we fetch a user object and the first page worth of posts written by that user, we generate the following:
my $user = $schema->resultset('User')->find({ username => 'fred' });
# Output fred's posts:
my $posts = $fred->posts->search({}, { rows => 10, page => 1 });
while (my $post = $fred->next) {
print $post->title, " ", $post->post, "\n";
}
The SQL generated:
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users me
WHERE me.username = 'fred'
SELECT me.id, me.user_id, me.created_date, me.title, me.post
FROM posts
WHERE me.user_id = 1
LIMIT 10
Or worse, we fetch a page of posts from different users for our frontpage, and then fetch the user details, we get a query per user:
my $posts_rs = $schema->resultset('Post')->search({}, { rows => 10, page => 1 });
# Output all posts:
while (my $post = $posts_rs->next) {
print $post->user->username, " ", $post->title, " ", $post->post, "\n";
}
The SQL generated:
SELECT me.id, me.user_id, me.created_date, me.title, me.post
FROM posts
LIMIT 10
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users me
WHERE me.id = 1
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users me
WHERE me.id = 2
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users me
WHERE me.id = 2
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users me
WHERE me.id = 1
.. and so on, one for each post, even if some are written by the same user. Of course we could reduce it by caching the user objects so that we don't refetch duplicates.
We can reduce this set of queries (in either case), by using the
prefetch
attribute to our initial query, which asks it to include
all the listed relations into the query.
Note, this currently does not work with multiple has_many
type
relations at the same level, as decoding the resulting data back into
objects is tricky.
So for users and the first page of posts:
my $user = $schema->resultset('User')->search(
{
username => 'fred',
},
{
prefetch => ['posts'],
rows => 10,
page => 1,
});
# Output fred's posts:
my $posts = $fred->posts->search({}, { rows => 10, page => 1 });
while (my $post = $fred->next) {
print $post->title, " ", $post->post, "\n";
}
Resulting in:
SELECT me.id, me.realname, me.username, me.password, me.email, posts.id, posts.user_id, posts.created_date, posts.title, posts.post
FROM users me
LEFT JOIN posts ON me.id = posts.user_id
WHERE me.username = 'fred'
LIMIT 10
And for the page of posts with users:
my $posts_rs = $schema->resultset('Post')->search(
{},
{
prefetch => [ 'user' ],
rows => 10,
page => 1,
});
# Output all posts:
while (my $post = $posts_rs->next) {
print $post->user->username, " ", $post->title, " ", $post->post, "\n";
}
The SQL generated:
SELECT me.id, me.user_id, me.created_date, me.title, me.post, user.id, user.realname, user.username, user.password, user.email
FROM posts
JOIN user ON me.id = posts.user_id
LIMIT 10
As mentioned in Chapter 4, ResultSets are used to store the conditions to create database queries. This means that a ResultSet object can be passed around and used to collect conditions and attributes, before running the query.
Suppose we're building a list of unwanted words to search for in
realnames of users, and then later filter the search to only certain
email domains, we can add to the condition later by calling search
again on the returned ResultSet. This returns a new ResultSet
containing the merged conditions.
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User');
## Add initial conditions ...
my @badwords = ('john', 'joe', 'joseph');
my $badusers_rs = $users_rs->search({
realname => [ map { { 'like' => "%$_%"} } @badwords ],
});
## And later ...
my $filtered_badwords_rs = $badwords_rs->search({
email => { '-like' => '%@example.com' }
});
Our query will now be:
SELECT me.id, me.realname, me.username, me.password, me.email
FROM users
WHERE me.realname LIKE '%john%' OR me.realname LIKE '%joe%' OR me.realname LIKE '%joseph%' AND email LIKE '%@example.com';
To re-iterate, search
does not change the ResultSet it was called
on, just returns a new ResultSet object with the conditions merged
into the existing set.
Conditions and attributes are either merged or replaced according to these rules.
-
Search conditions - merged into existing using AND.
-
where
attribute - merged using AND. -
having
attribute - merged using AND. -
join
,prefetch
,+columns
,+select
,+as
are merged into existing attributes.
All other attributes replaced with the newly supplied values.
You can always call search
on an existing ResultSet to add or change
conditions or attributes, to extend the query.
Extending queries can sometimes have unexpected results, suppose ... # need an example for this!
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User');
$users_rs->search({ some conds })->as_subselect_rs->search({ more filtering });
The SQL language has a whole host of set operations to help manipulate results and make the database software do as much work as possible for you. You may of course prefer to do these in Perl, its up to you. DBIx::Class doesn't currently simplify the use of these directly, however there is a module on CPAN which supplies them.
To use the UNION
, INTERSECT
and EXCEPT
keywords, install the
DBIx::Class::Helpers module from CPAN.
Using these constructs we can glue together the results of arbitrary resultsets. For example, if we want to allow users to search for either usernames or post titles (or contents), we can UNION together several resultsets. Take care to select the same number of columns in the resultsets.
To add new methods to our ResultSet classes, we will need to create
our own ResultSet subclass. This will simply inherit from the existing
default DBIx::Class::ResultSet class, and add the Helper
component. Put this in the MyBlog/Schema/ResultSet/User.pm
file:
package MyBlog::Schema::ResultSet::User;
use strict;
use warnings;
use base 'DBIx::Class::ResultSet';
__PACKAGE__->load_components(qw/Helper::ResultSet::SetOperations/);
1;
Now we can use the Helper methods provided. First, create some resultsets to query each of the fields we wish to search on:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $post_rs = $schema->resultset('Post');
my $users_rs = $schema->resultset('User');
my $title_rs = $post_rs->search(
{},
{
'columns' => ['id', { search => \'title as search'} ,{ tablename => \'"Post" as tablename'}],
}
);
my $content_rs = $post_rs->search(
{},
{
'columns' => ['id', { search => \'post as search'}, { tablename => \'"Post" as tablename'} ],
}
);
my $username_rs = $users_rs->search(
{},
{
'columns' => [ 'id', { search => \'username as search'} , { tablename => \'"User" as tablename'}],
}
);
my $realname_rs = $users_rs->search(
{},
{
'columns' => [ 'id', { search => \'realname as search'}, { tablename => \'"User" as tablename'}],
}
);
[%# may be able to remove this para at somepoint.. %] To ensure that
the resultset results will be consistant, the union
code checks that
the result_class
of its arguments are all the same. (See below) As
we're using different sources to union
we set the result_class here:
$title_rs->result_class('DBIx::Class::ResultClass::HashRefInflator');
$content_rs->result_class('DBIx::Class::ResultClass::HashRefInflator');
$username_rs->result_class('DBIx::Class::ResultClass::HashRefInflator');
$realname_rs->result_class('DBIx::Class::ResultClass::HashRefInflator');
Then create the union
and the actual search query:
my $search_term = 'fred';
my $datasearch_rs = $username_rs->union([$realname_rs, $title_rs, $content_rs])->search({
'search' => { '-like' => $search_term },
});
To actually get the results we need to introduce another new
technique, replacing the class that is used to create the Result
objects. In normal use this creates an instance of of your Result
class (which derives from DBIx::Class::Row). Using the Result class
for User
here will not work as the names of our columns don't match
the columns defined in the class. DBIx::Class installs one alternative
Result class generator for you,
DBIx::Class::ResultClass::HashRefInflator
, which causes the methods
next
, find
and all
to return a hashref of the results instead of
an object. We add it to the ResultSet using result_class
.
$datasearch_rs->result_class('DBIx::Class::ResultClass::HashRefInflator');
while( my $match = $datasearch_rs->next) {
## Enough data to create a link to the user/post of the match
print "Found: $match->{source}, $match->{id}, value: $match->{search}\n";
}
The SQL we get looks like this:
SELECT me.id, me.search, me.tablename
FROM (
SELECT me.id, username as search, "User" as tablename
FROM users me
UNION
SELECT me.id, realname as search, "User" as tablename
FROM users me
UNION
SELECT me.id, title as search, "Post" as tablename
FROM posts me
UNION
SELECT me.id, post as search, "Post" as tablename
FROM posts me) me
WHERE ( search LIKE ? )
Which is executed with the placeholders set to: ('fred')
intersect
and except
can be used in exactly the same way, and
produce respectively a set of results which exists in all the queries
given, and a set of results that contains all rows in the first query,
without any that appear in the subsequent queries.
This is an alternative way to get the same results we had with
group_by
and having
. First create a query to fetch all the users,
then use the except
method provided by DBIx::Class::Helpers to
subtract the users with no posts.
You can find this test in the file users_with_posts.t.
#!/usr/bin/env perl
use strict;
use warnings;
use Authen::Passphrase::SaltedDigest;
use Test::More;
use_ok('MyBlog::Schema');
package Test::ResultSet;
use strict;
use warnings;
use base 'DBIx::Class::Helper::ResultSet';
__PACKAGE__->mk_group_accessors('simple' => qw/method_calls/);
sub new {
my ($self, @args) = @_;
$self->method_calls({});
$self->next::method(@args);
}
## Count how many times except is called
sub except {
my ($self, @args) = @_;
$self->method_calls->{except}++;
$self->next::method(@args);
}
package main;
unlink 't/var/myblog.db';
my $schema = MyBlog::Schema->connect('dbi:SQLite:t/var/myblog.db');
$schema->deploy();
foreach my $source ($schema->sources) {
$schema->source($source)->resultset_class('Test::ResultSet');
}
my $users_rs = $schema->resultset('User');
## Add some initial data:
my @users = $users_rs->populate([
{
realname => 'Fred Bloggs',
username => 'fred',
password => Authen::Passphrase::SaltedDigest->new(algorithm => "SHA-1", salt_random => 20, passphrase=>'mypass'),
email => '[email protected]',
posts => [
{ title => 'Post 1', post => 'Post 1 content', created_date => DateTime->new(year => 2012, month => 01, day => 01) },
{ title => 'Post 2', post => 'Post 2 content', created_date => DateTime->new(year => 2012, month => 01, day => 03) },
],
},
{
realname => 'Joe Bloggs',
username => 'joe',
password => Authen::Passphrase::SaltedDigest->new(algorithm => "SHA-1", salt_random => 20, passphrase=>'sillypassword'),
email => '[email protected]',
},
{
realname => 'Jane Bloggs',
username => 'jane',
password => Authen::Passphrase::SaltedDigest->new(algorithm => "SHA-1", salt_random => 20, passphrase=>'sillypassword'),
email => '[email protected]',
posts => [
{ title => 'Post 3', post => 'Post 3 content', created_date => DateTime->new(year => 2012, month => 01, day => 05) },
{ title => 'Post 4', post => 'Post 4 content', created_date => DateTime->new(year => 2012, month => 01, day => 07) },
],
}
]);
### Users and their earliest posts
## Your code goes here:
my $users_with_posts_rs;
## Your code end
is($users_rs->method_calls->{except}, 1, 'Called "except" just once');
is($$users_with_posts_rs->count, 2, 'Found 2 users');
my $row = $$users_with_posts_rs->next;
ok($row->isa('MyBlog::Schema::Result::User'), 'Found user objects');
ok($row->username eq 'fred' || $row->username eq 'jane',
'Found users with posts, 1st user');
$row = $earliest_rs->next;
ok($row->username eq 'fred' || $row->username eq 'jane',
'Found users with posts, 2nd user');
done_testing;
SQL database systems provide two ways to store predefined queries in
the database. Views consist of a stored query, which can be used just
like a table in SELECT
statements. The underlying query is run when
needed to fetch the data.
As views are used similarly to tables, you can just create a normal
Result class, as described in ,
using the name of the view as the table
argument.
WARNING: Although this may make the view look like a table, it may not
allow you to run any of the methods to remove or alter the data, such
as update
and delete
. Whether these work will depend on the
structure of the query, and the database system you are using.
DBIx::Class also provides a specialised View
class which can store
the actual query behind the view, and thus allow it to be deployed
(output a CREATE VIEW statement), along with all the tables. Instead
of deploying the view or representing an actual view defined in the
database, it can also define a virtual view, a query just stored in
the DBIx::Class schema.
To create a View class to represent posts together with the username of the user that wrote them, we can create the PostsAndUser class:
1. package MyBlog::Schema::Result::PostsAndUser;
2. use strict;
3. use warnings;
4. use base qw/DBIx::Class::Core/;
5. __PACKAGE__->table_class('DBIx::Class::ResultSource::View');
6. __PACKAGE__->table('posts_and_user');
7. __PACKAGE__->result_source_instance->is_virtual(1);
9. __PACKAGE__->result_source_instance->view_definition(
10. "SELECT id, title, post, created_date, username, realname FROM posts JOIN users ON posts.user_id = users.id WHERE user.id = ?"
11. );
12. __PACKAGE__->add_columns(
13. 'id' => {
14. data_type => 'integer',
16. },
17. 'title' => {
18. data_type => 'varchar',
19. size => 255,
20. },
21. 'post => {
22. data_type => 'text',
23. },
24. 'created_date' => {
25. data_type => 'datetime',
26. },
27. 'username' => {
28. data_type => 'varchar',
29. size => 255,
30. },
31. 'realname' => {
32. data_type => 'varchar',
33. size => 255,
34. },
35. );
The main differences to the plain Table Result class are:
5. __PACKAGE__->table_class('DBIx::Class::ResultSource::View');
Line 5 sets the type of the Result class to be a View instead of the default Table class.
7. __PACKAGE__->result_source_instance->is_virtual(1);
Line 7 uses the is_virtual
method to define this as a DBIx::Class
level view, not a view in the database. Note that only virtual views
can contain placeholders.
9. __PACKAGE__->result_source_instance->view_definition(
10. "SELECT id, title, post, created_date, username, realname FROM posts JOIN users ON posts.user_id = users.id WHERE users.username = ?"
11. );
Lines 9 to 11 actually set the query used for the view.
We can use the view just like a normal query, providing the value for
the placeholder parameter in the bind
attribute:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $posts_and_users_rs = $schema->resultset('PostsAndUser');
my $freds_posts = $posts_and_users_rs->search({}, { bind => ['fred'] });
This would run the SQL:
SELECT me.id, me.title, me.post, me.created_date, me.username, me.realname
FROM (
SELECT id, title, post, created_date, username, realname
FROM posts
JOIN users ON posts.user_id = users.id
WHERE
users.username = ?
)
Using the placeholders: ( 'fred' )
Stored procedures are SQL code stored in the database that can run quite complex programs to create, update and otherwise manipulate data. They can return either single values, or a set of rows.
DBIx::Class doesn't provide any specific support for stored procedures, however they may be used as the content of a virtual view (returning a set), or as a function (returning single values).
When accessing the database using multiple processes or clients, such as when writing a multi-user application or website, creates the possibility of race conditions. In our MyBlog schema, these can happen if, for example, two users attempt to create a new user with the same username, at the same time.
## First user:
my $user = $users_rs->find_or_new({
username => 'fredbloggs',
realname => 'Fred Bloggs',
password => 'something daft',
email => '[email protected]',
});
if(!$user->in_storage) { # New user
$user->insert;
} else {
warn "fredblogs already exists!"
}
## Second user:
my $user = $users_rs->find_or_new({
username => 'fredbloggs',
realname => 'Fred Bloggs',
password => 'summat daft',
email => '[email protected]',
});
if(!$user->in_storage) { # New user
$user->insert;
} else {
warn "fredblogs already exists!"
}
On the surface this might seem straight forward, but internally we are running this set of queries:
## First user:
SELECT id, username, realname, password, email
FROM users
WHERE username = 'fredbloggs';
-- Do some work
INSERT INTO users
(username, realname, password, email)
VALUES('fredbloggs', 'Fred Bloggs', 'something daft', '[email protected]');
## Second user:
SELECT id, username, realname, password, email
FROM users
WHERE username = 'fredbloggs';
-- Do some work
INSERT INTO users
(username, realname, password, email)
VALUES('fredbloggs', 'Fred Bloggs', 'summat daft', '[email protected]');
We obviously intend for these to run first user, then second user, and
for the second one to get an error that a user named 'fredbloggs'
already exists. What may happen is that the SELECT
statements run
first and both return a result of 'no such user yet'. To ensure the
statements for each user creation run together, we need to start a
transaction
, using the txn_do
method:
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User');
$schema->txn_do( sub {
my $user = $users_rs->find_or_new({
username => 'fredbloggs',
realname => 'Fred Bloggs',
password => 'something daft',
email => '[email protected]',
});
# ... Do some work
if(!$user->in_storage) { # New user
$user->insert;
} else {
warn "fredbloggs already exists!"
}
} );
Now we'll get the SQL:
START TRANSACTION;
SELECT id, username, realname, password, email
FROM users
WHERE username = 'fredbloggs';
-- Do some work
INSERT INTO users
(username, realname, password, email)
VALUES('fredbloggs', 'Fred Bloggs', 'something daft', '[email protected]');
COMMIT;
If any of these fail, the entire set of statements is automatically
reverted using the ROLLBACK
statement.
It's possible that you can't collect all the code you need to run in a
transaction in the same place in your code. In this case there are
also the bare bones methods txn_begin
and txn_commit
on the
Schema
object, which will independently start and end a
transaction. Another alternative is the txn_scope_guard
method which
will return a $guard
object. This will issue a rollback if it goes
out of scope. Otherwise it can be used to issue a commit
statement.
my $schema = MyBlog::Schema->connect("dbi:mysql:dbname=myblog", "myuser", "mypassword");
my $users_rs = $schema->resultset('User');
{
my $guard = $schema->txn_scope_guard;
my $user = $users_rs->find_or_new({
username => 'fredbloggs',
realname => 'Fred Bloggs',
password => 'something daft',
email => '[email protected]',
});
# ... Do some work, pass $user and $guard around ...
if(!$user->in_storage) { # New user
$user->insert;
} else {
warn "fredbloggs already exists!"
}
$guard->commit;
}
Transactions may be also be nested.