plainblack.com
Username Password
search
Bookmark and Share

    

SQL Questoion: User login frequency

User arjan
Date 7/28/2007 5:43 pm
Views 1063
Rating -7    Rate [
|
]
Previous · Next
User Message
arjan

Hi all,

Perhaps this is more a question for the mysql discussion board, but since it involves a lot of WebGUI tables - and there are here so many SQL Wizards - I perhaps have a better chance here. 

I'm trying to make a statistics page where I show how many times registered users have logged in. Users that are not in any other group but registered users (2), but are of course also in the group everyone (7). (In this case they can also be in a special group that I have now excluded from the query, but is a third "and groupId!=..." in the subquery.)

Now the frequency is double as high as it should be as you can see below: If I count the logins of a specific user they are half as high. Here is the result of the query, the check that I did and an explanation of the subquery:

mysql> select groupings.userId as UserId1, count(groupings.userId) as Frequency, SUBSTRING(users.username,1,10) as Username, max(timeStamp) as lastLogin from users, groupings, userLoginLog where (select count(userId) from groupings where userId=UserId1 and groupId!='2' and groupId!='7') = 0 and users.userId=groupings.userId and groupings.userId=userLoginLog.userId and userLoginLog.status='success' and users.status="Active" group by groupings.userId having count(groupings.userId) > 30 order by username;

+------------------------+-----------+------------+------------+
| UserId1                | Frequency | Username   | lastLogin  |
+------------------------+-----------+------------+------------+
| 7XaVQRUzHA9v3gXgKyGtCw |        46 | aaldrik    | 1185648564 |
| J2jqkU8I8_UJ7LZhpQSegA |        56 | Arie H     | 1185542312 |
| DwblMDbwEsKSyDxJDRDz4Q |        34 | joselim    | 1184753426 |
| G4_I0w_8_lssv8TTHIt26g |        38 | Kira       | 1185040871 |
| If9U9GJm3vZPyjgCL-Av4Q |        56 | Marcus van | 1185095889 |
| RTmZCIO2UYC_hb2SBMEe9A |        36 | Oud        | 1185279141 |
| 6gtSCJmQ5pZonYbXwZ4dTA |        34 | Ruth Giese | 1184224744 |
| gwhVqLfwSat3wpTrxYSPUw |        40 | tishara    | 1185650528 |
| _RjTv3fRddxOx-58taDwug |        52 | Vincent    | 1185645106 |
+------------------------+-----------+------------+------------+
9 rows in set (11.66 sec)

 

Now, if I check the first two results, I see they are twice as high:

 

mysql> select count(userId) from userLoginLog where userId="7XaVQRUzHA9v3gXgKyGtCw";

+---------------+| count(userId) |+---------------+|            23 | +---------------+
1 row in set (0.00 sec)

mysql> select count(userId) from userLoginLog where userId="J2jqkU8I8_UJ7LZhpQSegA";

+---------------+| count(userId) |+---------------+|            28 |
+---------------+
1 row in set (0.00 sec)

 

Here is my query better readable and with some explanation:

 

select
    groupings.userId                 as UserId1,
    count(groupings.userId)      as Frequency,
    SUBSTRING(users.username,1,10)    as Username,
    max(timeStamp)             as lastLogin
from
    users,
    groupings,
    userLoginLog
where
    (select
        count(userId)
    from
        groupings
    where
        userId=UserId1
    and
        groupId!='2'
    and
        groupId!='7') = 0

/* In the subquery: for every individual id in the groupings table a lookup will be done for an entry other than 2 (Registered Users) and 7 (Everyone). The result should be a list of people that are only a member of these two groups.  */


and
    users.userId=groupings.userId
and
    groupings.userId=userLoginLog.userId
and
    userLoginLog.status='success'
and
    users.status="Active"
group by
    groupings.userId
having
    count(groupings.userId) > 30
order
    by username;

--- (Edited on 29-July-2007 00:43 [GMT+0200] by arjan) ---



Back to Top
Rate [
|
]
 
 
henryykt

Hi Arjan,

Since users are in 2 two groups and you're joining with the groupings table in the main query, every user gets counted twice. My suggestion is to move the join with groupings into the subquery which determine if a user is in those groups. Something like this:

select users.userId as UserId1
,      count(users.userId) as Frequency
,      SUBSTRING(users.username,1,10) as Username
,      max(timeStamp) as lastLogin
from   users
,      userLoginLog
where users.userId in (
   select userId
   from   users
   where (select count(userId) from groupings where userId=UserId1 and groupId!='2' and groupId!='7') = 0
)
and    users.userId=userLoginLog.userId
and    userLoginLog.status='success'
and    users.status="Active"
group by users.userId having count(users.userId) > 30 order by username;

Regards,


Henry

--- (Edited on 7/29/2007 3:16 am [GMT-0500] by henryykt) ---



Back to Top
Rate [
|
]
 
 
arjan

Dear Henry,

Thanx for your reply. I think I get the point as to why I get the result twice. However I don't quite get a grasp on the solution. 

These are the steps in my reasoning:

1. We select every userId from users that has zero entries in groupings table other than where the groupId is 2 or 7:

mysql> select

    userId as UserId1

from

    users

where

    (select count(*) from groupings where userId=UserId1 and groupId !='2' and groupId!='7') = 0 ;

 

(.....) 

 zm6i5i0S2TSXOLp50sMnfQ |
| zvEVKQ7rG1n59ShOL-b9Ww |
+------------------------+
1391 rows in set (0.00 sec)

2. The result is a group of 1391 id's for which I want to know how often they have logged in. So I will later use this group as a constraint. 

3. What I want to know is the frequency of logins:

mysql> select

    users.userId as UserId1,

    count(userLoginLog.userId)

from

    users,

    userLoginLog

where

    users.userId=userLoginLog.userId

group by

    userLoginLog.userId; 

 

(...)

| zm6i5i0S2TSXOLp50sMnfQ |                          4 |
| zvEVKQ7rG1n59ShOL-b9Ww |                          3 |
+------------------------+----------------------------+
1413 rows in set (0.00 sec)

4. The result is a larger group of 1413 id's than the group I like to have, since it is not restricted to membership of groupId 2 or 7. And they definitely have Id's in common. For example the last two, that are shown. 

5. In general I can restrict the results using a 'where user.userId in ()' clause (before the grouping) or an 'having users.userId in ()' clause (after the grouping). Both ways seem to work. For example:

mysql> select users.userId as UserId1, count(userLoginLog.userId) from users, userLoginLog where users.userId=userLoginLog.userId and users.userId in ("zvEVKQ7rG1n59ShOL-b9Ww", "zm6i5i0S2TSXOLp50sMnfQ") group by userLoginLog.userId; 

+------------------------+----------------------------+
| UserId1                | count(userLoginLog.userId) |
+------------------------+----------------------------+
| zm6i5i0S2TSXOLp50sMnfQ |                          4 |
| zvEVKQ7rG1n59ShOL-b9Ww |                          3 |
+------------------------+----------------------------+
2 rows in set (0.00 sec)

or

mysql> select users.userId as UserId1, count(userLoginLog.userId) from users, userLoginLog where users.userId=userLoginLog.userId group by userLoginLog.userId having users.userId in ("zvEVKQ7rG1n59ShOL-b9Ww", "zm6i5i0S2TSXOLp50sMnfQ");

+------------------------+----------------------------+
| UserId1                | count(userLoginLog.userId) |
+------------------------+----------------------------+
| zm6i5i0S2TSXOLp50sMnfQ |                          4 |
| zvEVKQ7rG1n59ShOL-b9Ww |                          3 |
+------------------------+----------------------------+
2 rows in set (0.00 sec)

 

6. So I should be able to substitute the two example id's above by my select statement from 1). However both formulated as 'where in ()' or as 'having in ()' produces an empty set:

mysql> select users.userId as UserId1, count(userLoginLog.userId) from users, userLoginLog where users.userId=userLoginLog.userId and users.userId in (select userId as UserId1 from users where (select count(*) from groupings where userId=UserId1 and groupId !='2' and groupId!='7') = 0) group by userLoginLog.userId;
Empty set (0.00 sec)

mysql> select users.userId as UserId1, count(userLoginLog.userId) from users, userLoginLog where users.userId=userLoginLog.userId group by userLoginLog.userId having users.userId in (select userId as UserId1 from users where (select count(*) from groupings where userId=UserId1 and groupId !='2' and groupId!='7') = 0);
Empty set (0.00 sec)

7. The italic part is a substitution from naming two userId's by a select statement I know to produce more than a thousand Id's. 

And here a promising road seems to end...

Kind regards,

Arjan. 

 

--- (Edited on 29-July-2007 16:17 [GMT+0200] by arjan) ---



Back to Top
Rate [
|
]
 
 
henryykt

Hi Arjan,

I suspect that the problem is caused by the fact that in your queries in step 6, the alias UserId1 is defined twice.  After removing both aliases I got the correct rows from mysql.

Best regards,

Henry

 

--- (Edited on 7/30/2007 5:43 pm [GMT-0500] by henryykt) ---



Back to Top
Rate [
|
]
 
 
arjan

Hi,

I think I've got it, it can be done two ways:

1) select only those who have logged in and

2) selecting every user

Yung, advised me not to use the syntax 'from <table1>,<table2>' but to use 'from <table1> on <table.column>=<table.column>' to make things more explicit. That helped. And as a side-effect reduced the search time from more than 10 seconds to less than 1 second. Here it is:

 

ad 1) 

select

    users.userId as UserId1,

    count(userLoginLog.timeStamp) as Frequency,

    SUBSTRING(users.username,1,10) as Username,

    max(timeStamp) as lastLogin

from

    userLoginLog

left join

    users

on

    users.userId = userLoginLog.userId

where

    users.status="Active"

and

    userLoginLog.status="success"

group by

    users.userId

having

    (select

        count(*)

    from

        groupings

    where

        userId=UserId1

    and

        groupId !='2'

    and

        groupId!='7')=0

and

    users.userId!='3'

and

    users.userId!='1'

order

    by Frequency desc; 

ad 2)

select

    users.userId as UserId1,

    count(userLoginLog.timeStamp) as Frequency,

    SUBSTRING(users.username,1,10) as Username,

    max(timeStamp) as lastLogin

from

    users

left join

    userLoginLog

on

    users.userId = userLoginLog.userId

where

    users.status="Active"

and

    userLoginLog.status="success"

group by

    users.userId

having

    (select

        count(*)

    from

        groupings

    where

        userId=UserId1

    and

        groupId !='2'

    and

        groupId!='7')=0

and

    users.userId!='3'

and

    users.userId!='1'

order by

    Frequency desc; 

--- (Edited on 31-July-2007 23:01 [GMT+0200] by arjan: Forgot the quotes around users.userId!='1') ---

--- (Edited on 31-July-2007 23:23 [GMT+0200] by arjan) ---



Back to Top
Rate [
|
]
 
 
    



© 2012 Plain Black Corporation | All Rights Reserved