Issue on l10n with history

classic Classic list List threaded Threaded
12 messages Options
Reply | Threaded
Open this post in threaded view
|

Issue on l10n with history

Ludovic Dubost
Working on l10n change to custom mapping showed up an issue with the l10n
history.
We have a huge history table with 2,3M lines (as well as an activity
stream) and we use the history table for extracting contributor statistics.

However looking at it more closely shows that the source of most data in
the history table is useless.
See here a grouping by dates showing that localisation of already close to
2M lines in 6 month in 2012.

|                             201201 |   318346 |
|                             201202 |   311403 |
|                             201203 |   728703 |
|                             201204 |   271657 |
|                             201205 |   296384 |
|                             201206 |   120463 |

Here is an example of history for an entry

mysql> select
xwr_author,xwr_version1,xwr_version2,xwr_date,xwr_comment,cast(xwr_isdiff
as unsigned), length(xwr_patch) from xwikircs where xwr_docid=596494851  ;
+-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
| xwr_author            | xwr_version1 | xwr_version2 | xwr_date
 | xwr_comment | cast(xwr_isdiff as unsigned) | length(xwr_patch) |
+-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
| XWiki.XWikiTranslator |            1 |            1 | 2010-02-23 11:38:27
|             |                            1 |               163 |
| XWiki.XWikiTranslator |            2 |            1 | 2010-02-23 11:40:26
|             |                            1 |               330 |
| XWiki.rbuj            |            3 |            1 | 2010-03-04 00:24:24
|             |                            1 |               189 |
| XWiki.rbuj            |            4 |            1 | 2010-03-04 01:02:52
|             |                            1 |               223 |
| XWiki.rbuj            |            5 |            1 | 2010-07-30 01:12:58
|             |                            0 |              5026 |
| XWiki.XWikiTranslator |            6 |            1 | 2012-01-23 11:40:25
|             |                            1 |               115 |
| XWiki.XWikiTranslator |            7 |            1 | 2012-01-23 11:58:25
|             |                            1 |               115 |
| XWiki.XWikiTranslator |            8 |            1 | 2012-01-23 13:31:52
|             |                            1 |               234 |
| XWiki.XWikiTranslator |            8 |            2 | 2012-01-23 15:36:02
| Prepared    |                            1 |               115 |
| XWiki.XWikiTranslator |            8 |            3 | 2012-01-24 02:01:12
| Prepared    |                            0 |              5670 |
| XWiki.XWikiTranslator |            8 |            4 | 2012-01-25 02:04:00
| Prepared    |                            1 |               115 |
| XWiki.XWikiTranslator |            8 |            5 | 2012-01-26 02:02:29
| Prepared    |                            1 |               115 |
| XWiki.XWikiTranslator |            8 |            6 | 2012-01-27 02:01:18
| Prepared    |                            1 |               115 |
| XWiki.XWikiTranslator |            8 |            7 | 2012-02-04 02:01:08
| Prepared    |                            1 |               115 |
| XWiki.XWikiTranslator |            8 |            8 | 2012-02-07 02:02:03
| Prepared    |                            0 |              5670 |
| XWiki.XWikiTranslator |            8 |            9 | 2012-02-08 02:01:20
| Prepared    |                            1 |               115 |
| XWiki.XWikiTranslator |            8 |           10 | 2012-02-14 02:01:07
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           11 | 2012-02-28 02:01:32
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           12 | 2012-03-03 02:01:20
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           13 | 2012-03-14 02:01:50
| Prepared    |                            0 |              5671 |
| XWiki.XWikiTranslator |            8 |           14 | 2012-03-14 18:23:06
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           15 | 2012-03-14 18:32:38
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           16 | 2012-03-14 18:37:17
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           17 | 2012-03-24 02:01:34
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           18 | 2012-03-26 16:40:35
| Prepared    |                            0 |              5671 |
| XWiki.XWikiTranslator |            8 |           19 | 2012-03-28 02:01:23
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           20 | 2012-03-29 02:01:45
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           21 | 2012-04-11 02:01:36
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           22 | 2012-04-19 15:20:10
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           24 | 2012-04-29 02:01:45
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           25 | 2012-05-04 18:55:30
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           26 | 2012-05-04 19:02:09
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           27 | 2012-05-10 02:04:36
| Prepared    |                            1 |               116 |
| XWiki.XWikiTranslator |            8 |           28 | 2012-05-30 02:02:08
| Prepared    |                            0 |              5671 |
| XWiki.XWikiTranslator |            8 |           29 | 2012-06-07 10:44:28
| Prepared    |                            0 |              5782 |
+-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+

http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_2107067965_core-menu-watchlist-add-page_nl?viewer=history&showminor=true

Most entries are "no changes" and have been cause by the L10NUpdater which
wrongefully saved the document with no changes. I believe this must have
been fixed (by Thomas M.?)  mid 2012.

Now the 2M lines impact performance significantly and loads the DB for
nothing (and in the activity stream as well).

I suggest we clean up the history and activity stream. We have 2
possibilities:

For xwikircs:

1/ Clean up only the bad data from XWikiTranslator when there are no
changes:

This is complicated as you need to verify if the change is actually a
change and you cannot do that just with sql queries. It could be very long

2/ Clean up old data from pre-201206 from XWikiTranslator

Simpler if it is safe to delete by date in the DB. After discussion with
sergui this is a bit complicated because you need to make sure you don't
delete the latest full version before the versions you keep. So you would
have to do it by API which will take ages.

3/ Clean up up old data from pre-201206 from all users

This is simpler as you can safely delete from the database everything older
than a certain versions. Cleans-up even more but would loose contributor
statistics unless we store 2012 contributor counts in an alternate table
which would then be regularly updated

In any case we should probably create this intermediary table for
statistics as it would be much faster anyway.

For activitystream:

1/ Clean up old data from XWikiTranslator 201206 or earlier

2/ Clean up old date from everybody 201206 or earlier

What value do we see in the l10n history and actvity stream and which
solutions do the other commiters suggest ?

I would say it's interesting for contributor statistics (counting number of
contribution by translators) but beyond that we can delete the data.
So we would fix that by storing monthly statistics in a table and updating
the latest 2 month through a scheduler job. This means that we can also
delete history over 2 month.

WDYT ?

Ludovic

--
Ludovic Dubost
Founder and CEO
Blog: http://blog.ludovic.org/
XWiki: http://www.xwiki.com
Skype: ldubost GTalk: ldubost
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

l10n : translation history

xrichard
Hello,

Is there a way to view history for a particular  translation key on l10n
translation Wiki?
I mean knowing:
- when the key has been introduced in XWiki?
- what are the differents values among time?
- what users have participated in the translation?

Because Ludovic his speaking of translation history (see below) but I don't see
the tradictionnal history tab on translations.

regards,

Le 24/01/2013 06:03, Ludovic Dubost a écrit :
> Working on l10n change to custom mapping showed up an issue with the l10n
> history.
> We have a huge history table with 2,3M lines (as well as an activity
> stream) and we use the history table for extracting contributor statistics.
> [...]
>
> Ludovic

_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: Issue on l10n with history

Sergiu Dumitriu-2
In reply to this post by Ludovic Dubost
On 01/24/2013 12:03 AM, Ludovic Dubost wrote:

> Working on l10n change to custom mapping showed up an issue with the l10n
> history.
> We have a huge history table with 2,3M lines (as well as an activity
> stream) and we use the history table for extracting contributor statistics.
>
> However looking at it more closely shows that the source of most data in
> the history table is useless.
> See here a grouping by dates showing that localisation of already close to
> 2M lines in 6 month in 2012.
>
> |                             201201 |   318346 |
> |                             201202 |   311403 |
> |                             201203 |   728703 |
> |                             201204 |   271657 |
> |                             201205 |   296384 |
> |                             201206 |   120463 |
>
> Here is an example of history for an entry
>
> mysql> select
> xwr_author,xwr_version1,xwr_version2,xwr_date,xwr_comment,cast(xwr_isdiff
> as unsigned), length(xwr_patch) from xwikircs where xwr_docid=596494851  ;
> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
> | xwr_author            | xwr_version1 | xwr_version2 | xwr_date
>  | xwr_comment | cast(xwr_isdiff as unsigned) | length(xwr_patch) |
> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
> | XWiki.XWikiTranslator |            1 |            1 | 2010-02-23 11:38:27
> |             |                            1 |               163 |
> | XWiki.XWikiTranslator |            2 |            1 | 2010-02-23 11:40:26
> |             |                            1 |               330 |
> | XWiki.rbuj            |            3 |            1 | 2010-03-04 00:24:24
> |             |                            1 |               189 |
> | XWiki.rbuj            |            4 |            1 | 2010-03-04 01:02:52
> |             |                            1 |               223 |
> | XWiki.rbuj            |            5 |            1 | 2010-07-30 01:12:58
> |             |                            0 |              5026 |
> | XWiki.XWikiTranslator |            6 |            1 | 2012-01-23 11:40:25
> |             |                            1 |               115 |
> | XWiki.XWikiTranslator |            7 |            1 | 2012-01-23 11:58:25
> |             |                            1 |               115 |
> | XWiki.XWikiTranslator |            8 |            1 | 2012-01-23 13:31:52
> |             |                            1 |               234 |
> | XWiki.XWikiTranslator |            8 |            2 | 2012-01-23 15:36:02
> | Prepared    |                            1 |               115 |
> | XWiki.XWikiTranslator |            8 |            3 | 2012-01-24 02:01:12
> | Prepared    |                            0 |              5670 |
> | XWiki.XWikiTranslator |            8 |            4 | 2012-01-25 02:04:00
> | Prepared    |                            1 |               115 |
> | XWiki.XWikiTranslator |            8 |            5 | 2012-01-26 02:02:29
> | Prepared    |                            1 |               115 |
> | XWiki.XWikiTranslator |            8 |            6 | 2012-01-27 02:01:18
> | Prepared    |                            1 |               115 |
> | XWiki.XWikiTranslator |            8 |            7 | 2012-02-04 02:01:08
> | Prepared    |                            1 |               115 |
> | XWiki.XWikiTranslator |            8 |            8 | 2012-02-07 02:02:03
> | Prepared    |                            0 |              5670 |
> | XWiki.XWikiTranslator |            8 |            9 | 2012-02-08 02:01:20
> | Prepared    |                            1 |               115 |
> | XWiki.XWikiTranslator |            8 |           10 | 2012-02-14 02:01:07
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           11 | 2012-02-28 02:01:32
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           12 | 2012-03-03 02:01:20
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           13 | 2012-03-14 02:01:50
> | Prepared    |                            0 |              5671 |
> | XWiki.XWikiTranslator |            8 |           14 | 2012-03-14 18:23:06
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           15 | 2012-03-14 18:32:38
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           16 | 2012-03-14 18:37:17
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           17 | 2012-03-24 02:01:34
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           18 | 2012-03-26 16:40:35
> | Prepared    |                            0 |              5671 |
> | XWiki.XWikiTranslator |            8 |           19 | 2012-03-28 02:01:23
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           20 | 2012-03-29 02:01:45
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           21 | 2012-04-11 02:01:36
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           22 | 2012-04-19 15:20:10
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           24 | 2012-04-29 02:01:45
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           25 | 2012-05-04 18:55:30
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           26 | 2012-05-04 19:02:09
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           27 | 2012-05-10 02:04:36
> | Prepared    |                            1 |               116 |
> | XWiki.XWikiTranslator |            8 |           28 | 2012-05-30 02:02:08
> | Prepared    |                            0 |              5671 |
> | XWiki.XWikiTranslator |            8 |           29 | 2012-06-07 10:44:28
> | Prepared    |                            0 |              5782 |
> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>
> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_2107067965_core-menu-watchlist-add-page_nl?viewer=history&showminor=true
>
> Most entries are "no changes" and have been cause by the L10NUpdater which
> wrongefully saved the document with no changes. I believe this must have
> been fixed (by Thomas M.?)  mid 2012.
>
> Now the 2M lines impact performance significantly and loads the DB for
> nothing (and in the activity stream as well).
>
> I suggest we clean up the history and activity stream. We have 2
> possibilities:
>
> For xwikircs:
>
> 1/ Clean up only the bad data from XWikiTranslator when there are no
> changes:
>
> This is complicated as you need to verify if the change is actually a
> change and you cannot do that just with sql queries. It could be very long
>
> 2/ Clean up old data from pre-201206 from XWikiTranslator
>
> Simpler if it is safe to delete by date in the DB. After discussion with
> sergui this is a bit complicated because you need to make sure you don't
> delete the latest full version before the versions you keep. So you would
> have to do it by API which will take ages.
>
> 3/ Clean up up old data from pre-201206 from all users
>
> This is simpler as you can safely delete from the database everything older
> than a certain versions. Cleans-up even more but would loose contributor
> statistics unless we store 2012 contributor counts in an alternate table
> which would then be regularly updated

+1 for 3/, without the extra stats table yet. See below.

Technically, we could also use /2, since we'll still have history
summary (who changed when), but some actual versions won't be
retrievable. Is the actual revision important? IMHO, only if we want to
investigate some foul play.

> In any case we should probably create this intermediary table for
> statistics as it would be much faster anyway.
>
> For activitystream:
>
> 1/ Clean up old data from XWikiTranslator 201206 or earlier
>
> 2/ Clean up old date from everybody 201206 or earlier

+1 for 1/

> What value do we see in the l10n history and actvity stream and which
> solutions do the other commiters suggest ?
>
> I would say it's interesting for contributor statistics (counting number of
> contribution by translators) but beyond that we can delete the data.
> So we would fix that by storing monthly statistics in a table and updating
> the latest 2 month through a scheduler job. This means that we can also
> delete history over 2 month.

We can get contributor activity from the activity stream, independently
from the RCS table. Since the activity stream doesn't have inter-version
dependencies like the RCS does, we can freely discard irrelevant rows
from it, and use the remaining valid data. The same information is
present in the activity stream and in the RCS table: who did what, when,
and where.

I mean, this is a good fast solution for the moment. We could still make
a custom statistics data structure for translation contributors, but
that would take longer, and thus it delays the migration of the l10n
wiki to a newer version, and delays the performance improvement that
we'd gain by dropping data.

--
Sergiu Dumitriu
http://purl.org/net/sergiu
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: Issue on l10n with history

Thomas Mortagne
Administrator
On Thu, Jan 24, 2013 at 8:15 AM, Sergiu Dumitriu <[hidden email]> wrote:

> On 01/24/2013 12:03 AM, Ludovic Dubost wrote:
>> Working on l10n change to custom mapping showed up an issue with the l10n
>> history.
>> We have a huge history table with 2,3M lines (as well as an activity
>> stream) and we use the history table for extracting contributor statistics.
>>
>> However looking at it more closely shows that the source of most data in
>> the history table is useless.
>> See here a grouping by dates showing that localisation of already close to
>> 2M lines in 6 month in 2012.
>>
>> |                             201201 |   318346 |
>> |                             201202 |   311403 |
>> |                             201203 |   728703 |
>> |                             201204 |   271657 |
>> |                             201205 |   296384 |
>> |                             201206 |   120463 |
>>
>> Here is an example of history for an entry
>>
>> mysql> select
>> xwr_author,xwr_version1,xwr_version2,xwr_date,xwr_comment,cast(xwr_isdiff
>> as unsigned), length(xwr_patch) from xwikircs where xwr_docid=596494851  ;
>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>> | xwr_author            | xwr_version1 | xwr_version2 | xwr_date
>>  | xwr_comment | cast(xwr_isdiff as unsigned) | length(xwr_patch) |
>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>> | XWiki.XWikiTranslator |            1 |            1 | 2010-02-23 11:38:27
>> |             |                            1 |               163 |
>> | XWiki.XWikiTranslator |            2 |            1 | 2010-02-23 11:40:26
>> |             |                            1 |               330 |
>> | XWiki.rbuj            |            3 |            1 | 2010-03-04 00:24:24
>> |             |                            1 |               189 |
>> | XWiki.rbuj            |            4 |            1 | 2010-03-04 01:02:52
>> |             |                            1 |               223 |
>> | XWiki.rbuj            |            5 |            1 | 2010-07-30 01:12:58
>> |             |                            0 |              5026 |
>> | XWiki.XWikiTranslator |            6 |            1 | 2012-01-23 11:40:25
>> |             |                            1 |               115 |
>> | XWiki.XWikiTranslator |            7 |            1 | 2012-01-23 11:58:25
>> |             |                            1 |               115 |
>> | XWiki.XWikiTranslator |            8 |            1 | 2012-01-23 13:31:52
>> |             |                            1 |               234 |
>> | XWiki.XWikiTranslator |            8 |            2 | 2012-01-23 15:36:02
>> | Prepared    |                            1 |               115 |
>> | XWiki.XWikiTranslator |            8 |            3 | 2012-01-24 02:01:12
>> | Prepared    |                            0 |              5670 |
>> | XWiki.XWikiTranslator |            8 |            4 | 2012-01-25 02:04:00
>> | Prepared    |                            1 |               115 |
>> | XWiki.XWikiTranslator |            8 |            5 | 2012-01-26 02:02:29
>> | Prepared    |                            1 |               115 |
>> | XWiki.XWikiTranslator |            8 |            6 | 2012-01-27 02:01:18
>> | Prepared    |                            1 |               115 |
>> | XWiki.XWikiTranslator |            8 |            7 | 2012-02-04 02:01:08
>> | Prepared    |                            1 |               115 |
>> | XWiki.XWikiTranslator |            8 |            8 | 2012-02-07 02:02:03
>> | Prepared    |                            0 |              5670 |
>> | XWiki.XWikiTranslator |            8 |            9 | 2012-02-08 02:01:20
>> | Prepared    |                            1 |               115 |
>> | XWiki.XWikiTranslator |            8 |           10 | 2012-02-14 02:01:07
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           11 | 2012-02-28 02:01:32
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           12 | 2012-03-03 02:01:20
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           13 | 2012-03-14 02:01:50
>> | Prepared    |                            0 |              5671 |
>> | XWiki.XWikiTranslator |            8 |           14 | 2012-03-14 18:23:06
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           15 | 2012-03-14 18:32:38
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           16 | 2012-03-14 18:37:17
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           17 | 2012-03-24 02:01:34
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           18 | 2012-03-26 16:40:35
>> | Prepared    |                            0 |              5671 |
>> | XWiki.XWikiTranslator |            8 |           19 | 2012-03-28 02:01:23
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           20 | 2012-03-29 02:01:45
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           21 | 2012-04-11 02:01:36
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           22 | 2012-04-19 15:20:10
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           24 | 2012-04-29 02:01:45
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           25 | 2012-05-04 18:55:30
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           26 | 2012-05-04 19:02:09
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           27 | 2012-05-10 02:04:36
>> | Prepared    |                            1 |               116 |
>> | XWiki.XWikiTranslator |            8 |           28 | 2012-05-30 02:02:08
>> | Prepared    |                            0 |              5671 |
>> | XWiki.XWikiTranslator |            8 |           29 | 2012-06-07 10:44:28
>> | Prepared    |                            0 |              5782 |
>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>
>> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_2107067965_core-menu-watchlist-add-page_nl?viewer=history&showminor=true
>>
>> Most entries are "no changes" and have been cause by the L10NUpdater which
>> wrongefully saved the document with no changes. I believe this must have
>> been fixed (by Thomas M.?)  mid 2012.

Yep, save is supposed to be done only when necessary now.

>>
>> Now the 2M lines impact performance significantly and loads the DB for
>> nothing (and in the activity stream as well).
>>
>> I suggest we clean up the history and activity stream. We have 2
>> possibilities:
>>
>> For xwikircs:
>>
>> 1/ Clean up only the bad data from XWikiTranslator when there are no
>> changes:
>>
>> This is complicated as you need to verify if the change is actually a
>> change and you cannot do that just with sql queries. It could be very long
>>
>> 2/ Clean up old data from pre-201206 from XWikiTranslator
>>
>> Simpler if it is safe to delete by date in the DB. After discussion with
>> sergui this is a bit complicated because you need to make sure you don't
>> delete the latest full version before the versions you keep. So you would
>> have to do it by API which will take ages.
>>
>> 3/ Clean up up old data from pre-201206 from all users
>>
>> This is simpler as you can safely delete from the database everything older
>> than a certain versions. Cleans-up even more but would loose contributor
>> statistics unless we store 2012 contributor counts in an alternate table
>> which would then be regularly updated
>
> +1 for 3/, without the extra stats table yet. See below.
>
> Technically, we could also use /2, since we'll still have history
> summary (who changed when), but some actual versions won't be
> retrievable. Is the actual revision important? IMHO, only if we want to
> investigate some foul play.
>
>> In any case we should probably create this intermediary table for
>> statistics as it would be much faster anyway.
>>
>> For activitystream:
>>
>> 1/ Clean up old data from XWikiTranslator 201206 or earlier
>>
>> 2/ Clean up old date from everybody 201206 or earlier
>
> +1 for 1/

+1 for 1/ plus some very old history associated to ludovic user which
is kind of the ancestor of XWikiTranslator.

>
>> What value do we see in the l10n history and actvity stream and which
>> solutions do the other commiters suggest ?
>>
>> I would say it's interesting for contributor statistics (counting number of
>> contribution by translators) but beyond that we can delete the data.
>> So we would fix that by storing monthly statistics in a table and updating
>> the latest 2 month through a scheduler job. This means that we can also
>> delete history over 2 month.
>
> We can get contributor activity from the activity stream, independently
> from the RCS table. Since the activity stream doesn't have inter-version
> dependencies like the RCS does, we can freely discard irrelevant rows
> from it, and use the remaining valid data. The same information is
> present in the activity stream and in the RCS table: who did what, when,
> and where.
>
> I mean, this is a good fast solution for the moment. We could still make
> a custom statistics data structure for translation contributors, but
> that would take longer, and thus it delays the migration of the l10n
> wiki to a newer version, and delays the performance improvement that
> we'd gain by dropping data.
>
> --
> Sergiu Dumitriu
> http://purl.org/net/sergiu
> _______________________________________________
> devs mailing list
> [hidden email]
> http://lists.xwiki.org/mailman/listinfo/devs



--
Thomas Mortagne
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: l10n : translation history

Thomas Mortagne
Administrator
In reply to this post by xrichard
Here is an example of where a translation is actually saved:
http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_1033928161_platform-extension-distributionWizard-stepHeading_fr

You should see who changed it when and when XWiki Translator introduced it.

On Thu, Jan 24, 2013 at 7:45 AM, Ldm Public <[hidden email]> wrote:

> Hello,
>
> Is there a way to view history for a particular  translation key on l10n
> translation Wiki?
> I mean knowing:
> - when the key has been introduced in XWiki?
> - what are the differents values among time?
> - what users have participated in the translation?
>
> Because Ludovic his speaking of translation history (see below) but I don't
> see the tradictionnal history tab on translations.
>
> regards,
>
> Le 24/01/2013 06:03, Ludovic Dubost a écrit :
>>
>> Working on l10n change to custom mapping showed up an issue with the l10n
>> history.
>> We have a huge history table with 2,3M lines (as well as an activity
>> stream) and we use the history table for extracting contributor
>> statistics.
>> [...]
>>
>> Ludovic
>
>
> _______________________________________________
> devs mailing list
> [hidden email]
> http://lists.xwiki.org/mailman/listinfo/devs



--
Thomas Mortagne
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: l10n : translation history

Thomas Mortagne
Administrator
You can find the real location of a translation using
l10n.xwiki.org/xwiki/bin/view/L10N/Search

On Thu, Jan 24, 2013 at 9:23 AM, Thomas Mortagne
<[hidden email]> wrote:

> Here is an example of where a translation is actually saved:
> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_1033928161_platform-extension-distributionWizard-stepHeading_fr
>
> You should see who changed it when and when XWiki Translator introduced it.
>
> On Thu, Jan 24, 2013 at 7:45 AM, Ldm Public <[hidden email]> wrote:
>> Hello,
>>
>> Is there a way to view history for a particular  translation key on l10n
>> translation Wiki?
>> I mean knowing:
>> - when the key has been introduced in XWiki?
>> - what are the differents values among time?
>> - what users have participated in the translation?
>>
>> Because Ludovic his speaking of translation history (see below) but I don't
>> see the tradictionnal history tab on translations.
>>
>> regards,
>>
>> Le 24/01/2013 06:03, Ludovic Dubost a écrit :
>>>
>>> Working on l10n change to custom mapping showed up an issue with the l10n
>>> history.
>>> We have a huge history table with 2,3M lines (as well as an activity
>>> stream) and we use the history table for extracting contributor
>>> statistics.
>>> [...]
>>>
>>> Ludovic
>>
>>
>> _______________________________________________
>> devs mailing list
>> [hidden email]
>> http://lists.xwiki.org/mailman/listinfo/devs
>
>
>
> --
> Thomas Mortagne



--
Thomas Mortagne
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: l10n : translation history

xrichard
Thanks Thomas, I have found at the top of each "form translation page" the
link to the real translation page where full history is available.

Thanks again, it's hard to be a newbie ;-).

2013/1/24 Thomas Mortagne <[hidden email]>

> You can find the real location of a translation using
> l10n.xwiki.org/xwiki/bin/view/L10N/Search
>
> On Thu, Jan 24, 2013 at 9:23 AM, Thomas Mortagne
> <[hidden email]> wrote:
> > Here is an example of where a translation is actually saved:
> >
> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_1033928161_platform-extension-distributionWizard-stepHeading_fr
> >
> > You should see who changed it when and when XWiki Translator introduced
> it.
> >
> > On Thu, Jan 24, 2013 at 7:45 AM, Ldm Public <[hidden email]> wrote:
> >> Hello,
> >>
> >> Is there a way to view history for a particular  translation key on l10n
> >> translation Wiki?
> >> I mean knowing:
> >> - when the key has been introduced in XWiki?
> >> - what are the differents values among time?
> >> - what users have participated in the translation?
> >>
> >> Because Ludovic his speaking of translation history (see below) but I
> don't
> >> see the tradictionnal history tab on translations.
> >>
> >> regards,
> >>
> >> Le 24/01/2013 06:03, Ludovic Dubost a écrit :
> >>>
> >>> Working on l10n change to custom mapping showed up an issue with the
> l10n
> >>> history.
> >>> We have a huge history table with 2,3M lines (as well as an activity
> >>> stream) and we use the history table for extracting contributor
> >>> statistics.
> >>> [...]
> >>>
> >>> Ludovic
> >>
> >>
> >> _______________________________________________
> >> devs mailing list
> >> [hidden email]
> >> http://lists.xwiki.org/mailman/listinfo/devs
> >
> >
> >
> > --
> > Thomas Mortagne
>
>
>
> --
> Thomas Mortagne
> _______________________________________________
> devs mailing list
> [hidden email]
> http://lists.xwiki.org/mailman/listinfo/devs
>
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: Issue on l10n with history

Ludovic Dubost
In reply to this post by Sergiu Dumitriu-2
The custom statistics data structure is not that complicated.
I've already the SQL query that does the initial version (based on xwikircs
for now)

  create table l10n_stats_author as select year(xwr_date)*100+
month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
1,2;
  delete from l10n_stats_author where l10n_month=201301 or
l10n_month=201302;
  insert into l10n_stats_author (select year(xwr_date)*100+ month(xwr_date)
as l10n_month, XWR_AUTHOR as l10n_author, count(distinct XWR_DOCID,
XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where xwr_date >=
'2012-12-01' group by 1,2);
  select * from l10n_stats_author where l10n_month=201301 or
l10n_month=201302;


Putting this in a scheduler job should not be complicated

Ludovic



2013/1/24 Sergiu Dumitriu <[hidden email]>

> On 01/24/2013 12:03 AM, Ludovic Dubost wrote:
> > Working on l10n change to custom mapping showed up an issue with the l10n
> > history.
> > We have a huge history table with 2,3M lines (as well as an activity
> > stream) and we use the history table for extracting contributor
> statistics.
> >
> > However looking at it more closely shows that the source of most data in
> > the history table is useless.
> > See here a grouping by dates showing that localisation of already close
> to
> > 2M lines in 6 month in 2012.
> >
> > |                             201201 |   318346 |
> > |                             201202 |   311403 |
> > |                             201203 |   728703 |
> > |                             201204 |   271657 |
> > |                             201205 |   296384 |
> > |                             201206 |   120463 |
> >
> > Here is an example of history for an entry
> >
> > mysql> select
> > xwr_author,xwr_version1,xwr_version2,xwr_date,xwr_comment,cast(xwr_isdiff
> > as unsigned), length(xwr_patch) from xwikircs where xwr_docid=596494851
>  ;
> >
> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
> > | xwr_author            | xwr_version1 | xwr_version2 | xwr_date
> >  | xwr_comment | cast(xwr_isdiff as unsigned) | length(xwr_patch) |
> >
> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
> > | XWiki.XWikiTranslator |            1 |            1 | 2010-02-23
> 11:38:27
> > |             |                            1 |               163 |
> > | XWiki.XWikiTranslator |            2 |            1 | 2010-02-23
> 11:40:26
> > |             |                            1 |               330 |
> > | XWiki.rbuj            |            3 |            1 | 2010-03-04
> 00:24:24
> > |             |                            1 |               189 |
> > | XWiki.rbuj            |            4 |            1 | 2010-03-04
> 01:02:52
> > |             |                            1 |               223 |
> > | XWiki.rbuj            |            5 |            1 | 2010-07-30
> 01:12:58
> > |             |                            0 |              5026 |
> > | XWiki.XWikiTranslator |            6 |            1 | 2012-01-23
> 11:40:25
> > |             |                            1 |               115 |
> > | XWiki.XWikiTranslator |            7 |            1 | 2012-01-23
> 11:58:25
> > |             |                            1 |               115 |
> > | XWiki.XWikiTranslator |            8 |            1 | 2012-01-23
> 13:31:52
> > |             |                            1 |               234 |
> > | XWiki.XWikiTranslator |            8 |            2 | 2012-01-23
> 15:36:02
> > | Prepared    |                            1 |               115 |
> > | XWiki.XWikiTranslator |            8 |            3 | 2012-01-24
> 02:01:12
> > | Prepared    |                            0 |              5670 |
> > | XWiki.XWikiTranslator |            8 |            4 | 2012-01-25
> 02:04:00
> > | Prepared    |                            1 |               115 |
> > | XWiki.XWikiTranslator |            8 |            5 | 2012-01-26
> 02:02:29
> > | Prepared    |                            1 |               115 |
> > | XWiki.XWikiTranslator |            8 |            6 | 2012-01-27
> 02:01:18
> > | Prepared    |                            1 |               115 |
> > | XWiki.XWikiTranslator |            8 |            7 | 2012-02-04
> 02:01:08
> > | Prepared    |                            1 |               115 |
> > | XWiki.XWikiTranslator |            8 |            8 | 2012-02-07
> 02:02:03
> > | Prepared    |                            0 |              5670 |
> > | XWiki.XWikiTranslator |            8 |            9 | 2012-02-08
> 02:01:20
> > | Prepared    |                            1 |               115 |
> > | XWiki.XWikiTranslator |            8 |           10 | 2012-02-14
> 02:01:07
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           11 | 2012-02-28
> 02:01:32
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           12 | 2012-03-03
> 02:01:20
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           13 | 2012-03-14
> 02:01:50
> > | Prepared    |                            0 |              5671 |
> > | XWiki.XWikiTranslator |            8 |           14 | 2012-03-14
> 18:23:06
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           15 | 2012-03-14
> 18:32:38
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           16 | 2012-03-14
> 18:37:17
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           17 | 2012-03-24
> 02:01:34
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           18 | 2012-03-26
> 16:40:35
> > | Prepared    |                            0 |              5671 |
> > | XWiki.XWikiTranslator |            8 |           19 | 2012-03-28
> 02:01:23
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           20 | 2012-03-29
> 02:01:45
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           21 | 2012-04-11
> 02:01:36
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           22 | 2012-04-19
> 15:20:10
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           24 | 2012-04-29
> 02:01:45
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           25 | 2012-05-04
> 18:55:30
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           26 | 2012-05-04
> 19:02:09
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           27 | 2012-05-10
> 02:04:36
> > | Prepared    |                            1 |               116 |
> > | XWiki.XWikiTranslator |            8 |           28 | 2012-05-30
> 02:02:08
> > | Prepared    |                            0 |              5671 |
> > | XWiki.XWikiTranslator |            8 |           29 | 2012-06-07
> 10:44:28
> > | Prepared    |                            0 |              5782 |
> >
> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
> >
> >
> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_2107067965_core-menu-watchlist-add-page_nl?viewer=history&showminor=true
> >
> > Most entries are "no changes" and have been cause by the L10NUpdater
> which
> > wrongefully saved the document with no changes. I believe this must have
> > been fixed (by Thomas M.?)  mid 2012.
> >
> > Now the 2M lines impact performance significantly and loads the DB for
> > nothing (and in the activity stream as well).
> >
> > I suggest we clean up the history and activity stream. We have 2
> > possibilities:
> >
> > For xwikircs:
> >
> > 1/ Clean up only the bad data from XWikiTranslator when there are no
> > changes:
> >
> > This is complicated as you need to verify if the change is actually a
> > change and you cannot do that just with sql queries. It could be very
> long
> >
> > 2/ Clean up old data from pre-201206 from XWikiTranslator
> >
> > Simpler if it is safe to delete by date in the DB. After discussion with
> > sergui this is a bit complicated because you need to make sure you don't
> > delete the latest full version before the versions you keep. So you would
> > have to do it by API which will take ages.
> >
> > 3/ Clean up up old data from pre-201206 from all users
> >
> > This is simpler as you can safely delete from the database everything
> older
> > than a certain versions. Cleans-up even more but would loose contributor
> > statistics unless we store 2012 contributor counts in an alternate table
> > which would then be regularly updated
>
> +1 for 3/, without the extra stats table yet. See below.
>
> Technically, we could also use /2, since we'll still have history
> summary (who changed when), but some actual versions won't be
> retrievable. Is the actual revision important? IMHO, only if we want to
> investigate some foul play.
>
> > In any case we should probably create this intermediary table for
> > statistics as it would be much faster anyway.
> >
> > For activitystream:
> >
> > 1/ Clean up old data from XWikiTranslator 201206 or earlier
> >
> > 2/ Clean up old date from everybody 201206 or earlier
>
> +1 for 1/
>
> > What value do we see in the l10n history and actvity stream and which
> > solutions do the other commiters suggest ?
> >
> > I would say it's interesting for contributor statistics (counting number
> of
> > contribution by translators) but beyond that we can delete the data.
> > So we would fix that by storing monthly statistics in a table and
> updating
> > the latest 2 month through a scheduler job. This means that we can also
> > delete history over 2 month.
>
> We can get contributor activity from the activity stream, independently
> from the RCS table. Since the activity stream doesn't have inter-version
> dependencies like the RCS does, we can freely discard irrelevant rows
> from it, and use the remaining valid data. The same information is
> present in the activity stream and in the RCS table: who did what, when,
> and where.
>
> I mean, this is a good fast solution for the moment. We could still make
> a custom statistics data structure for translation contributors, but
> that would take longer, and thus it delays the migration of the l10n
> wiki to a newer version, and delays the performance improvement that
> we'd gain by dropping data.
>
> --
> Sergiu Dumitriu
> http://purl.org/net/sergiu
> _______________________________________________
> devs mailing list
> [hidden email]
> http://lists.xwiki.org/mailman/listinfo/devs
>



--
Ludovic Dubost
Founder and CEO
Blog: http://blog.ludovic.org/
XWiki: http://www.xwiki.com
Skype: ldubost GTalk: ldubost
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: Issue on l10n with history

Ludovic Dubost
FYI, here are the running times of these queries:

mysql> create table l10n_stats_author as select year(xwr_date)*100+
month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
1,2;
Query OK, 763 rows affected (1 min 28.36 sec)
Records: 763  Duplicates: 0  Warnings: 0

mysql>   delete from l10n_stats_author where l10n_month=201301 or
l10n_month=201302;
Query OK, 29 rows affected (0.00 sec)

mysql>   insert into l10n_stats_author (select year(xwr_date)*100+
month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where
xwr_date >= '2012-12-01' group by 1,2);
Query OK, 35 rows affected (0.07 sec)
Records: 35  Duplicates: 0  Warnings: 0

Update is very fast

Ludovic


2013/1/24 Ludovic Dubost <[hidden email]>

> The custom statistics data structure is not that complicated.
> I've already the SQL query that does the initial version (based on
> xwikircs for now)
>
>   create table l10n_stats_author as select year(xwr_date)*100+
> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
> 1,2;
>   delete from l10n_stats_author where l10n_month=201301 or
> l10n_month=201302;
>   insert into l10n_stats_author (select year(xwr_date)*100+
> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where
> xwr_date >= '2012-12-01' group by 1,2);
>   select * from l10n_stats_author where l10n_month=201301 or
> l10n_month=201302;
>
>
> Putting this in a scheduler job should not be complicated
>
> Ludovic
>
>
>
> 2013/1/24 Sergiu Dumitriu <[hidden email]>
>
>> On 01/24/2013 12:03 AM, Ludovic Dubost wrote:
>> > Working on l10n change to custom mapping showed up an issue with the
>> l10n
>> > history.
>> > We have a huge history table with 2,3M lines (as well as an activity
>> > stream) and we use the history table for extracting contributor
>> statistics.
>> >
>> > However looking at it more closely shows that the source of most data in
>> > the history table is useless.
>> > See here a grouping by dates showing that localisation of already close
>> to
>> > 2M lines in 6 month in 2012.
>> >
>> > |                             201201 |   318346 |
>> > |                             201202 |   311403 |
>> > |                             201203 |   728703 |
>> > |                             201204 |   271657 |
>> > |                             201205 |   296384 |
>> > |                             201206 |   120463 |
>> >
>> > Here is an example of history for an entry
>> >
>> > mysql> select
>> >
>> xwr_author,xwr_version1,xwr_version2,xwr_date,xwr_comment,cast(xwr_isdiff
>> > as unsigned), length(xwr_patch) from xwikircs where xwr_docid=596494851
>>  ;
>> >
>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>> > | xwr_author            | xwr_version1 | xwr_version2 | xwr_date
>> >  | xwr_comment | cast(xwr_isdiff as unsigned) | length(xwr_patch) |
>> >
>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>> > | XWiki.XWikiTranslator |            1 |            1 | 2010-02-23
>> 11:38:27
>> > |             |                            1 |               163 |
>> > | XWiki.XWikiTranslator |            2 |            1 | 2010-02-23
>> 11:40:26
>> > |             |                            1 |               330 |
>> > | XWiki.rbuj            |            3 |            1 | 2010-03-04
>> 00:24:24
>> > |             |                            1 |               189 |
>> > | XWiki.rbuj            |            4 |            1 | 2010-03-04
>> 01:02:52
>> > |             |                            1 |               223 |
>> > | XWiki.rbuj            |            5 |            1 | 2010-07-30
>> 01:12:58
>> > |             |                            0 |              5026 |
>> > | XWiki.XWikiTranslator |            6 |            1 | 2012-01-23
>> 11:40:25
>> > |             |                            1 |               115 |
>> > | XWiki.XWikiTranslator |            7 |            1 | 2012-01-23
>> 11:58:25
>> > |             |                            1 |               115 |
>> > | XWiki.XWikiTranslator |            8 |            1 | 2012-01-23
>> 13:31:52
>> > |             |                            1 |               234 |
>> > | XWiki.XWikiTranslator |            8 |            2 | 2012-01-23
>> 15:36:02
>> > | Prepared    |                            1 |               115 |
>> > | XWiki.XWikiTranslator |            8 |            3 | 2012-01-24
>> 02:01:12
>> > | Prepared    |                            0 |              5670 |
>> > | XWiki.XWikiTranslator |            8 |            4 | 2012-01-25
>> 02:04:00
>> > | Prepared    |                            1 |               115 |
>> > | XWiki.XWikiTranslator |            8 |            5 | 2012-01-26
>> 02:02:29
>> > | Prepared    |                            1 |               115 |
>> > | XWiki.XWikiTranslator |            8 |            6 | 2012-01-27
>> 02:01:18
>> > | Prepared    |                            1 |               115 |
>> > | XWiki.XWikiTranslator |            8 |            7 | 2012-02-04
>> 02:01:08
>> > | Prepared    |                            1 |               115 |
>> > | XWiki.XWikiTranslator |            8 |            8 | 2012-02-07
>> 02:02:03
>> > | Prepared    |                            0 |              5670 |
>> > | XWiki.XWikiTranslator |            8 |            9 | 2012-02-08
>> 02:01:20
>> > | Prepared    |                            1 |               115 |
>> > | XWiki.XWikiTranslator |            8 |           10 | 2012-02-14
>> 02:01:07
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           11 | 2012-02-28
>> 02:01:32
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           12 | 2012-03-03
>> 02:01:20
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           13 | 2012-03-14
>> 02:01:50
>> > | Prepared    |                            0 |              5671 |
>> > | XWiki.XWikiTranslator |            8 |           14 | 2012-03-14
>> 18:23:06
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           15 | 2012-03-14
>> 18:32:38
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           16 | 2012-03-14
>> 18:37:17
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           17 | 2012-03-24
>> 02:01:34
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           18 | 2012-03-26
>> 16:40:35
>> > | Prepared    |                            0 |              5671 |
>> > | XWiki.XWikiTranslator |            8 |           19 | 2012-03-28
>> 02:01:23
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           20 | 2012-03-29
>> 02:01:45
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           21 | 2012-04-11
>> 02:01:36
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           22 | 2012-04-19
>> 15:20:10
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           24 | 2012-04-29
>> 02:01:45
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           25 | 2012-05-04
>> 18:55:30
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           26 | 2012-05-04
>> 19:02:09
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           27 | 2012-05-10
>> 02:04:36
>> > | Prepared    |                            1 |               116 |
>> > | XWiki.XWikiTranslator |            8 |           28 | 2012-05-30
>> 02:02:08
>> > | Prepared    |                            0 |              5671 |
>> > | XWiki.XWikiTranslator |            8 |           29 | 2012-06-07
>> 10:44:28
>> > | Prepared    |                            0 |              5782 |
>> >
>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>> >
>> >
>> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_2107067965_core-menu-watchlist-add-page_nl?viewer=history&showminor=true
>> >
>> > Most entries are "no changes" and have been cause by the L10NUpdater
>> which
>> > wrongefully saved the document with no changes. I believe this must have
>> > been fixed (by Thomas M.?)  mid 2012.
>> >
>> > Now the 2M lines impact performance significantly and loads the DB for
>> > nothing (and in the activity stream as well).
>> >
>> > I suggest we clean up the history and activity stream. We have 2
>> > possibilities:
>> >
>> > For xwikircs:
>> >
>> > 1/ Clean up only the bad data from XWikiTranslator when there are no
>> > changes:
>> >
>> > This is complicated as you need to verify if the change is actually a
>> > change and you cannot do that just with sql queries. It could be very
>> long
>> >
>> > 2/ Clean up old data from pre-201206 from XWikiTranslator
>> >
>> > Simpler if it is safe to delete by date in the DB. After discussion with
>> > sergui this is a bit complicated because you need to make sure you don't
>> > delete the latest full version before the versions you keep. So you
>> would
>> > have to do it by API which will take ages.
>> >
>> > 3/ Clean up up old data from pre-201206 from all users
>> >
>> > This is simpler as you can safely delete from the database everything
>> older
>> > than a certain versions. Cleans-up even more but would loose contributor
>> > statistics unless we store 2012 contributor counts in an alternate table
>> > which would then be regularly updated
>>
>> +1 for 3/, without the extra stats table yet. See below.
>>
>> Technically, we could also use /2, since we'll still have history
>> summary (who changed when), but some actual versions won't be
>> retrievable. Is the actual revision important? IMHO, only if we want to
>> investigate some foul play.
>>
>> > In any case we should probably create this intermediary table for
>> > statistics as it would be much faster anyway.
>> >
>> > For activitystream:
>> >
>> > 1/ Clean up old data from XWikiTranslator 201206 or earlier
>> >
>> > 2/ Clean up old date from everybody 201206 or earlier
>>
>> +1 for 1/
>>
>> > What value do we see in the l10n history and actvity stream and which
>> > solutions do the other commiters suggest ?
>> >
>> > I would say it's interesting for contributor statistics (counting
>> number of
>> > contribution by translators) but beyond that we can delete the data.
>> > So we would fix that by storing monthly statistics in a table and
>> updating
>> > the latest 2 month through a scheduler job. This means that we can also
>> > delete history over 2 month.
>>
>> We can get contributor activity from the activity stream, independently
>> from the RCS table. Since the activity stream doesn't have inter-version
>> dependencies like the RCS does, we can freely discard irrelevant rows
>> from it, and use the remaining valid data. The same information is
>> present in the activity stream and in the RCS table: who did what, when,
>> and where.
>>
>> I mean, this is a good fast solution for the moment. We could still make
>> a custom statistics data structure for translation contributors, but
>> that would take longer, and thus it delays the migration of the l10n
>> wiki to a newer version, and delays the performance improvement that
>> we'd gain by dropping data.
>>
>> --
>> Sergiu Dumitriu
>> http://purl.org/net/sergiu
>> _______________________________________________
>> devs mailing list
>> [hidden email]
>> http://lists.xwiki.org/mailman/listinfo/devs
>>
>
>
>
> --
> Ludovic Dubost
> Founder and CEO
> Blog: http://blog.ludovic.org/
> XWiki: http://www.xwiki.com
> Skype: ldubost GTalk: ldubost
>



--
Ludovic Dubost
Founder and CEO
Blog: http://blog.ludovic.org/
XWiki: http://www.xwiki.com
Skype: ldubost GTalk: ldubost
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: Issue on l10n with history

Ludovic Dubost
Hi,

I've published a migration plan:

http://l10n.xwiki.org/xwiki/bin/view/L10N/L10NMigrationPlan

Review welcome

Ludovic


2013/1/24 Ludovic Dubost <[hidden email]>

> FYI, here are the running times of these queries:
>
> mysql> create table l10n_stats_author as select year(xwr_date)*100+
> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
> 1,2;
> Query OK, 763 rows affected (1 min 28.36 sec)
> Records: 763  Duplicates: 0  Warnings: 0
>
> mysql>   delete from l10n_stats_author where l10n_month=201301 or
> l10n_month=201302;
> Query OK, 29 rows affected (0.00 sec)
>
> mysql>   insert into l10n_stats_author (select year(xwr_date)*100+
> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where
> xwr_date >= '2012-12-01' group by 1,2);
> Query OK, 35 rows affected (0.07 sec)
> Records: 35  Duplicates: 0  Warnings: 0
>
> Update is very fast
>
> Ludovic
>
>
> 2013/1/24 Ludovic Dubost <[hidden email]>
>
>> The custom statistics data structure is not that complicated.
>> I've already the SQL query that does the initial version (based on
>> xwikircs for now)
>>
>>   create table l10n_stats_author as select year(xwr_date)*100+
>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
>> 1,2;
>>   delete from l10n_stats_author where l10n_month=201301 or
>> l10n_month=201302;
>>   insert into l10n_stats_author (select year(xwr_date)*100+
>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where
>> xwr_date >= '2012-12-01' group by 1,2);
>>   select * from l10n_stats_author where l10n_month=201301 or
>> l10n_month=201302;
>>
>>
>> Putting this in a scheduler job should not be complicated
>>
>> Ludovic
>>
>>
>>
>> 2013/1/24 Sergiu Dumitriu <[hidden email]>
>>
>>> On 01/24/2013 12:03 AM, Ludovic Dubost wrote:
>>> > Working on l10n change to custom mapping showed up an issue with the
>>> l10n
>>> > history.
>>> > We have a huge history table with 2,3M lines (as well as an activity
>>> > stream) and we use the history table for extracting contributor
>>> statistics.
>>> >
>>> > However looking at it more closely shows that the source of most data
>>> in
>>> > the history table is useless.
>>> > See here a grouping by dates showing that localisation of already
>>> close to
>>> > 2M lines in 6 month in 2012.
>>> >
>>> > |                             201201 |   318346 |
>>> > |                             201202 |   311403 |
>>> > |                             201203 |   728703 |
>>> > |                             201204 |   271657 |
>>> > |                             201205 |   296384 |
>>> > |                             201206 |   120463 |
>>> >
>>> > Here is an example of history for an entry
>>> >
>>> > mysql> select
>>> >
>>> xwr_author,xwr_version1,xwr_version2,xwr_date,xwr_comment,cast(xwr_isdiff
>>> > as unsigned), length(xwr_patch) from xwikircs where
>>> xwr_docid=596494851  ;
>>> >
>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>> > | xwr_author            | xwr_version1 | xwr_version2 | xwr_date
>>> >  | xwr_comment | cast(xwr_isdiff as unsigned) | length(xwr_patch) |
>>> >
>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>> > | XWiki.XWikiTranslator |            1 |            1 | 2010-02-23
>>> 11:38:27
>>> > |             |                            1 |               163 |
>>> > | XWiki.XWikiTranslator |            2 |            1 | 2010-02-23
>>> 11:40:26
>>> > |             |                            1 |               330 |
>>> > | XWiki.rbuj            |            3 |            1 | 2010-03-04
>>> 00:24:24
>>> > |             |                            1 |               189 |
>>> > | XWiki.rbuj            |            4 |            1 | 2010-03-04
>>> 01:02:52
>>> > |             |                            1 |               223 |
>>> > | XWiki.rbuj            |            5 |            1 | 2010-07-30
>>> 01:12:58
>>> > |             |                            0 |              5026 |
>>> > | XWiki.XWikiTranslator |            6 |            1 | 2012-01-23
>>> 11:40:25
>>> > |             |                            1 |               115 |
>>> > | XWiki.XWikiTranslator |            7 |            1 | 2012-01-23
>>> 11:58:25
>>> > |             |                            1 |               115 |
>>> > | XWiki.XWikiTranslator |            8 |            1 | 2012-01-23
>>> 13:31:52
>>> > |             |                            1 |               234 |
>>> > | XWiki.XWikiTranslator |            8 |            2 | 2012-01-23
>>> 15:36:02
>>> > | Prepared    |                            1 |               115 |
>>> > | XWiki.XWikiTranslator |            8 |            3 | 2012-01-24
>>> 02:01:12
>>> > | Prepared    |                            0 |              5670 |
>>> > | XWiki.XWikiTranslator |            8 |            4 | 2012-01-25
>>> 02:04:00
>>> > | Prepared    |                            1 |               115 |
>>> > | XWiki.XWikiTranslator |            8 |            5 | 2012-01-26
>>> 02:02:29
>>> > | Prepared    |                            1 |               115 |
>>> > | XWiki.XWikiTranslator |            8 |            6 | 2012-01-27
>>> 02:01:18
>>> > | Prepared    |                            1 |               115 |
>>> > | XWiki.XWikiTranslator |            8 |            7 | 2012-02-04
>>> 02:01:08
>>> > | Prepared    |                            1 |               115 |
>>> > | XWiki.XWikiTranslator |            8 |            8 | 2012-02-07
>>> 02:02:03
>>> > | Prepared    |                            0 |              5670 |
>>> > | XWiki.XWikiTranslator |            8 |            9 | 2012-02-08
>>> 02:01:20
>>> > | Prepared    |                            1 |               115 |
>>> > | XWiki.XWikiTranslator |            8 |           10 | 2012-02-14
>>> 02:01:07
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           11 | 2012-02-28
>>> 02:01:32
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           12 | 2012-03-03
>>> 02:01:20
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           13 | 2012-03-14
>>> 02:01:50
>>> > | Prepared    |                            0 |              5671 |
>>> > | XWiki.XWikiTranslator |            8 |           14 | 2012-03-14
>>> 18:23:06
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           15 | 2012-03-14
>>> 18:32:38
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           16 | 2012-03-14
>>> 18:37:17
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           17 | 2012-03-24
>>> 02:01:34
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           18 | 2012-03-26
>>> 16:40:35
>>> > | Prepared    |                            0 |              5671 |
>>> > | XWiki.XWikiTranslator |            8 |           19 | 2012-03-28
>>> 02:01:23
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           20 | 2012-03-29
>>> 02:01:45
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           21 | 2012-04-11
>>> 02:01:36
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           22 | 2012-04-19
>>> 15:20:10
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           24 | 2012-04-29
>>> 02:01:45
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           25 | 2012-05-04
>>> 18:55:30
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           26 | 2012-05-04
>>> 19:02:09
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           27 | 2012-05-10
>>> 02:04:36
>>> > | Prepared    |                            1 |               116 |
>>> > | XWiki.XWikiTranslator |            8 |           28 | 2012-05-30
>>> 02:02:08
>>> > | Prepared    |                            0 |              5671 |
>>> > | XWiki.XWikiTranslator |            8 |           29 | 2012-06-07
>>> 10:44:28
>>> > | Prepared    |                            0 |              5782 |
>>> >
>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>> >
>>> >
>>> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_2107067965_core-menu-watchlist-add-page_nl?viewer=history&showminor=true
>>> >
>>> > Most entries are "no changes" and have been cause by the L10NUpdater
>>> which
>>> > wrongefully saved the document with no changes. I believe this must
>>> have
>>> > been fixed (by Thomas M.?)  mid 2012.
>>> >
>>> > Now the 2M lines impact performance significantly and loads the DB for
>>> > nothing (and in the activity stream as well).
>>> >
>>> > I suggest we clean up the history and activity stream. We have 2
>>> > possibilities:
>>> >
>>> > For xwikircs:
>>> >
>>> > 1/ Clean up only the bad data from XWikiTranslator when there are no
>>> > changes:
>>> >
>>> > This is complicated as you need to verify if the change is actually a
>>> > change and you cannot do that just with sql queries. It could be very
>>> long
>>> >
>>> > 2/ Clean up old data from pre-201206 from XWikiTranslator
>>> >
>>> > Simpler if it is safe to delete by date in the DB. After discussion
>>> with
>>> > sergui this is a bit complicated because you need to make sure you
>>> don't
>>> > delete the latest full version before the versions you keep. So you
>>> would
>>> > have to do it by API which will take ages.
>>> >
>>> > 3/ Clean up up old data from pre-201206 from all users
>>> >
>>> > This is simpler as you can safely delete from the database everything
>>> older
>>> > than a certain versions. Cleans-up even more but would loose
>>> contributor
>>> > statistics unless we store 2012 contributor counts in an alternate
>>> table
>>> > which would then be regularly updated
>>>
>>> +1 for 3/, without the extra stats table yet. See below.
>>>
>>> Technically, we could also use /2, since we'll still have history
>>> summary (who changed when), but some actual versions won't be
>>> retrievable. Is the actual revision important? IMHO, only if we want to
>>> investigate some foul play.
>>>
>>> > In any case we should probably create this intermediary table for
>>> > statistics as it would be much faster anyway.
>>> >
>>> > For activitystream:
>>> >
>>> > 1/ Clean up old data from XWikiTranslator 201206 or earlier
>>> >
>>> > 2/ Clean up old date from everybody 201206 or earlier
>>>
>>> +1 for 1/
>>>
>>> > What value do we see in the l10n history and actvity stream and which
>>> > solutions do the other commiters suggest ?
>>> >
>>> > I would say it's interesting for contributor statistics (counting
>>> number of
>>> > contribution by translators) but beyond that we can delete the data.
>>> > So we would fix that by storing monthly statistics in a table and
>>> updating
>>> > the latest 2 month through a scheduler job. This means that we can also
>>> > delete history over 2 month.
>>>
>>> We can get contributor activity from the activity stream, independently
>>> from the RCS table. Since the activity stream doesn't have inter-version
>>> dependencies like the RCS does, we can freely discard irrelevant rows
>>> from it, and use the remaining valid data. The same information is
>>> present in the activity stream and in the RCS table: who did what, when,
>>> and where.
>>>
>>> I mean, this is a good fast solution for the moment. We could still make
>>> a custom statistics data structure for translation contributors, but
>>> that would take longer, and thus it delays the migration of the l10n
>>> wiki to a newer version, and delays the performance improvement that
>>> we'd gain by dropping data.
>>>
>>> --
>>> Sergiu Dumitriu
>>> http://purl.org/net/sergiu
>>> _______________________________________________
>>> devs mailing list
>>> [hidden email]
>>> http://lists.xwiki.org/mailman/listinfo/devs
>>>
>>
>>
>>
>> --
>> Ludovic Dubost
>> Founder and CEO
>> Blog: http://blog.ludovic.org/
>> XWiki: http://www.xwiki.com
>> Skype: ldubost GTalk: ldubost
>>
>
>
>
> --
> Ludovic Dubost
> Founder and CEO
> Blog: http://blog.ludovic.org/
> XWiki: http://www.xwiki.com
> Skype: ldubost GTalk: ldubost
>



--
Ludovic Dubost
Founder and CEO
Blog: http://blog.ludovic.org/
XWiki: http://www.xwiki.com
Skype: ldubost GTalk: ldubost
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: Issue on l10n with history

Ludovic Dubost
I've updated the migration plan with in particular the clean-up queries
that would delete the old history data and the old XWikiTranslator activity
stream data.
The volumes are much more reasonable now (100K activity stream and 30k
history) so I don't think we need to delete more.

I'm finalizing the custom mapping migration locally (this will add
significant history as every document will be resaved).

Can the queries be reviewed ?

http://l10n.xwiki.org/xwiki/bin/view/L10N/L10NMigrationPlan

Ludovic


2013/1/24 Ludovic Dubost <[hidden email]>

> Hi,
>
> I've published a migration plan:
>
> http://l10n.xwiki.org/xwiki/bin/view/L10N/L10NMigrationPlan
>
> Review welcome
>
> Ludovic
>
>
> 2013/1/24 Ludovic Dubost <[hidden email]>
>
>> FYI, here are the running times of these queries:
>>
>> mysql> create table l10n_stats_author as select year(xwr_date)*100+
>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
>> 1,2;
>> Query OK, 763 rows affected (1 min 28.36 sec)
>> Records: 763  Duplicates: 0  Warnings: 0
>>
>> mysql>   delete from l10n_stats_author where l10n_month=201301 or
>> l10n_month=201302;
>>  Query OK, 29 rows affected (0.00 sec)
>>
>> mysql>   insert into l10n_stats_author (select year(xwr_date)*100+
>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where
>> xwr_date >= '2012-12-01' group by 1,2);
>> Query OK, 35 rows affected (0.07 sec)
>> Records: 35  Duplicates: 0  Warnings: 0
>>
>> Update is very fast
>>
>> Ludovic
>>
>>
>> 2013/1/24 Ludovic Dubost <[hidden email]>
>>
>>> The custom statistics data structure is not that complicated.
>>> I've already the SQL query that does the initial version (based on
>>> xwikircs for now)
>>>
>>>   create table l10n_stats_author as select year(xwr_date)*100+
>>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
>>> 1,2;
>>>   delete from l10n_stats_author where l10n_month=201301 or
>>> l10n_month=201302;
>>>   insert into l10n_stats_author (select year(xwr_date)*100+
>>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where
>>> xwr_date >= '2012-12-01' group by 1,2);
>>>   select * from l10n_stats_author where l10n_month=201301 or
>>> l10n_month=201302;
>>>
>>>
>>> Putting this in a scheduler job should not be complicated
>>>
>>> Ludovic
>>>
>>>
>>>
>>> 2013/1/24 Sergiu Dumitriu <[hidden email]>
>>>
>>>> On 01/24/2013 12:03 AM, Ludovic Dubost wrote:
>>>> > Working on l10n change to custom mapping showed up an issue with the
>>>> l10n
>>>> > history.
>>>> > We have a huge history table with 2,3M lines (as well as an activity
>>>> > stream) and we use the history table for extracting contributor
>>>> statistics.
>>>> >
>>>> > However looking at it more closely shows that the source of most data
>>>> in
>>>> > the history table is useless.
>>>> > See here a grouping by dates showing that localisation of already
>>>> close to
>>>> > 2M lines in 6 month in 2012.
>>>> >
>>>> > |                             201201 |   318346 |
>>>> > |                             201202 |   311403 |
>>>> > |                             201203 |   728703 |
>>>> > |                             201204 |   271657 |
>>>> > |                             201205 |   296384 |
>>>> > |                             201206 |   120463 |
>>>> >
>>>> > Here is an example of history for an entry
>>>> >
>>>> > mysql> select
>>>> >
>>>> xwr_author,xwr_version1,xwr_version2,xwr_date,xwr_comment,cast(xwr_isdiff
>>>> > as unsigned), length(xwr_patch) from xwikircs where
>>>> xwr_docid=596494851  ;
>>>> >
>>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>>> > | xwr_author            | xwr_version1 | xwr_version2 | xwr_date
>>>> >  | xwr_comment | cast(xwr_isdiff as unsigned) | length(xwr_patch) |
>>>> >
>>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>>> > | XWiki.XWikiTranslator |            1 |            1 | 2010-02-23
>>>> 11:38:27
>>>> > |             |                            1 |               163 |
>>>> > | XWiki.XWikiTranslator |            2 |            1 | 2010-02-23
>>>> 11:40:26
>>>> > |             |                            1 |               330 |
>>>> > | XWiki.rbuj            |            3 |            1 | 2010-03-04
>>>> 00:24:24
>>>> > |             |                            1 |               189 |
>>>> > | XWiki.rbuj            |            4 |            1 | 2010-03-04
>>>> 01:02:52
>>>> > |             |                            1 |               223 |
>>>> > | XWiki.rbuj            |            5 |            1 | 2010-07-30
>>>> 01:12:58
>>>> > |             |                            0 |              5026 |
>>>> > | XWiki.XWikiTranslator |            6 |            1 | 2012-01-23
>>>> 11:40:25
>>>> > |             |                            1 |               115 |
>>>> > | XWiki.XWikiTranslator |            7 |            1 | 2012-01-23
>>>> 11:58:25
>>>> > |             |                            1 |               115 |
>>>> > | XWiki.XWikiTranslator |            8 |            1 | 2012-01-23
>>>> 13:31:52
>>>> > |             |                            1 |               234 |
>>>> > | XWiki.XWikiTranslator |            8 |            2 | 2012-01-23
>>>> 15:36:02
>>>> > | Prepared    |                            1 |               115 |
>>>> > | XWiki.XWikiTranslator |            8 |            3 | 2012-01-24
>>>> 02:01:12
>>>> > | Prepared    |                            0 |              5670 |
>>>> > | XWiki.XWikiTranslator |            8 |            4 | 2012-01-25
>>>> 02:04:00
>>>> > | Prepared    |                            1 |               115 |
>>>> > | XWiki.XWikiTranslator |            8 |            5 | 2012-01-26
>>>> 02:02:29
>>>> > | Prepared    |                            1 |               115 |
>>>> > | XWiki.XWikiTranslator |            8 |            6 | 2012-01-27
>>>> 02:01:18
>>>> > | Prepared    |                            1 |               115 |
>>>> > | XWiki.XWikiTranslator |            8 |            7 | 2012-02-04
>>>> 02:01:08
>>>> > | Prepared    |                            1 |               115 |
>>>> > | XWiki.XWikiTranslator |            8 |            8 | 2012-02-07
>>>> 02:02:03
>>>> > | Prepared    |                            0 |              5670 |
>>>> > | XWiki.XWikiTranslator |            8 |            9 | 2012-02-08
>>>> 02:01:20
>>>> > | Prepared    |                            1 |               115 |
>>>> > | XWiki.XWikiTranslator |            8 |           10 | 2012-02-14
>>>> 02:01:07
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           11 | 2012-02-28
>>>> 02:01:32
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           12 | 2012-03-03
>>>> 02:01:20
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           13 | 2012-03-14
>>>> 02:01:50
>>>> > | Prepared    |                            0 |              5671 |
>>>> > | XWiki.XWikiTranslator |            8 |           14 | 2012-03-14
>>>> 18:23:06
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           15 | 2012-03-14
>>>> 18:32:38
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           16 | 2012-03-14
>>>> 18:37:17
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           17 | 2012-03-24
>>>> 02:01:34
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           18 | 2012-03-26
>>>> 16:40:35
>>>> > | Prepared    |                            0 |              5671 |
>>>> > | XWiki.XWikiTranslator |            8 |           19 | 2012-03-28
>>>> 02:01:23
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           20 | 2012-03-29
>>>> 02:01:45
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           21 | 2012-04-11
>>>> 02:01:36
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           22 | 2012-04-19
>>>> 15:20:10
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           24 | 2012-04-29
>>>> 02:01:45
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           25 | 2012-05-04
>>>> 18:55:30
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           26 | 2012-05-04
>>>> 19:02:09
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           27 | 2012-05-10
>>>> 02:04:36
>>>> > | Prepared    |                            1 |               116 |
>>>> > | XWiki.XWikiTranslator |            8 |           28 | 2012-05-30
>>>> 02:02:08
>>>> > | Prepared    |                            0 |              5671 |
>>>> > | XWiki.XWikiTranslator |            8 |           29 | 2012-06-07
>>>> 10:44:28
>>>> > | Prepared    |                            0 |              5782 |
>>>> >
>>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>>> >
>>>> >
>>>> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_2107067965_core-menu-watchlist-add-page_nl?viewer=history&showminor=true
>>>> >
>>>> > Most entries are "no changes" and have been cause by the L10NUpdater
>>>> which
>>>> > wrongefully saved the document with no changes. I believe this must
>>>> have
>>>> > been fixed (by Thomas M.?)  mid 2012.
>>>> >
>>>> > Now the 2M lines impact performance significantly and loads the DB for
>>>> > nothing (and in the activity stream as well).
>>>> >
>>>> > I suggest we clean up the history and activity stream. We have 2
>>>> > possibilities:
>>>> >
>>>> > For xwikircs:
>>>> >
>>>> > 1/ Clean up only the bad data from XWikiTranslator when there are no
>>>> > changes:
>>>> >
>>>> > This is complicated as you need to verify if the change is actually a
>>>> > change and you cannot do that just with sql queries. It could be very
>>>> long
>>>> >
>>>> > 2/ Clean up old data from pre-201206 from XWikiTranslator
>>>> >
>>>> > Simpler if it is safe to delete by date in the DB. After discussion
>>>> with
>>>> > sergui this is a bit complicated because you need to make sure you
>>>> don't
>>>> > delete the latest full version before the versions you keep. So you
>>>> would
>>>> > have to do it by API which will take ages.
>>>> >
>>>> > 3/ Clean up up old data from pre-201206 from all users
>>>> >
>>>> > This is simpler as you can safely delete from the database everything
>>>> older
>>>> > than a certain versions. Cleans-up even more but would loose
>>>> contributor
>>>> > statistics unless we store 2012 contributor counts in an alternate
>>>> table
>>>> > which would then be regularly updated
>>>>
>>>> +1 for 3/, without the extra stats table yet. See below.
>>>>
>>>> Technically, we could also use /2, since we'll still have history
>>>> summary (who changed when), but some actual versions won't be
>>>> retrievable. Is the actual revision important? IMHO, only if we want to
>>>> investigate some foul play.
>>>>
>>>> > In any case we should probably create this intermediary table for
>>>> > statistics as it would be much faster anyway.
>>>> >
>>>> > For activitystream:
>>>> >
>>>> > 1/ Clean up old data from XWikiTranslator 201206 or earlier
>>>> >
>>>> > 2/ Clean up old date from everybody 201206 or earlier
>>>>
>>>> +1 for 1/
>>>>
>>>> > What value do we see in the l10n history and actvity stream and which
>>>> > solutions do the other commiters suggest ?
>>>> >
>>>> > I would say it's interesting for contributor statistics (counting
>>>> number of
>>>> > contribution by translators) but beyond that we can delete the data.
>>>> > So we would fix that by storing monthly statistics in a table and
>>>> updating
>>>> > the latest 2 month through a scheduler job. This means that we can
>>>> also
>>>> > delete history over 2 month.
>>>>
>>>> We can get contributor activity from the activity stream, independently
>>>> from the RCS table. Since the activity stream doesn't have inter-version
>>>> dependencies like the RCS does, we can freely discard irrelevant rows
>>>> from it, and use the remaining valid data. The same information is
>>>> present in the activity stream and in the RCS table: who did what, when,
>>>> and where.
>>>>
>>>> I mean, this is a good fast solution for the moment. We could still make
>>>> a custom statistics data structure for translation contributors, but
>>>> that would take longer, and thus it delays the migration of the l10n
>>>> wiki to a newer version, and delays the performance improvement that
>>>> we'd gain by dropping data.
>>>>
>>>> --
>>>> Sergiu Dumitriu
>>>> http://purl.org/net/sergiu
>>>> _______________________________________________
>>>> devs mailing list
>>>> [hidden email]
>>>> http://lists.xwiki.org/mailman/listinfo/devs
>>>>
>>>
>>>
>>>
>>> --
>>> Ludovic Dubost
>>> Founder and CEO
>>> Blog: http://blog.ludovic.org/
>>> XWiki: http://www.xwiki.com
>>> Skype: ldubost GTalk: ldubost
>>>
>>
>>
>>
>> --
>> Ludovic Dubost
>> Founder and CEO
>> Blog: http://blog.ludovic.org/
>> XWiki: http://www.xwiki.com
>> Skype: ldubost GTalk: ldubost
>>
>
>
>
> --
> Ludovic Dubost
> Founder and CEO
> Blog: http://blog.ludovic.org/
> XWiki: http://www.xwiki.com
> Skype: ldubost GTalk: ldubost
>



--
Ludovic Dubost
Founder and CEO
Blog: http://blog.ludovic.org/
XWiki: http://www.xwiki.com
Skype: ldubost GTalk: ldubost
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs
Reply | Threaded
Open this post in threaded view
|

Re: Issue on l10n with history

Ludovic Dubost
I have a question on the strategy to move to custom mapping.

1/ One strategy is to resave all documents that use custom mapping and from
now on, custom mapping will the the source of data.
2/ Another stragegy would be to move the data using SQL.

I also see that although the data is copied over on save to the custom
mapping table the old data is not removed in the standard xwiki tables
(this should probably be reported as a bug) and then should be manually
deleted, so I'm starting to think that the second strategy would be the
fastes and most efficient one.

Any experience in this area ?

Ludovic


2013/1/29 Ludovic Dubost <[hidden email]>

> I've updated the migration plan with in particular the clean-up queries
> that would delete the old history data and the old XWikiTranslator activity
> stream data.
> The volumes are much more reasonable now (100K activity stream and 30k
> history) so I don't think we need to delete more.
>
> I'm finalizing the custom mapping migration locally (this will add
> significant history as every document will be resaved).
>
> Can the queries be reviewed ?
>
> http://l10n.xwiki.org/xwiki/bin/view/L10N/L10NMigrationPlan
>
> Ludovic
>
>
> 2013/1/24 Ludovic Dubost <[hidden email]>
>
>> Hi,
>>
>> I've published a migration plan:
>>
>> http://l10n.xwiki.org/xwiki/bin/view/L10N/L10NMigrationPlan
>>
>> Review welcome
>>
>> Ludovic
>>
>>
>> 2013/1/24 Ludovic Dubost <[hidden email]>
>>
>>> FYI, here are the running times of these queries:
>>>
>>> mysql> create table l10n_stats_author as select year(xwr_date)*100+
>>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
>>> 1,2;
>>> Query OK, 763 rows affected (1 min 28.36 sec)
>>> Records: 763  Duplicates: 0  Warnings: 0
>>>
>>> mysql>   delete from l10n_stats_author where l10n_month=201301 or
>>> l10n_month=201302;
>>> Query OK, 29 rows affected (0.00 sec)
>>>
>>> mysql>   insert into l10n_stats_author (select year(xwr_date)*100+
>>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where
>>> xwr_date >= '2012-12-01' group by 1,2);
>>> Query OK, 35 rows affected (0.07 sec)
>>> Records: 35  Duplicates: 0  Warnings: 0
>>>
>>> Update is very fast
>>>
>>> Ludovic
>>>
>>>
>>> 2013/1/24 Ludovic Dubost <[hidden email]>
>>>
>>>> The custom statistics data structure is not that complicated.
>>>> I've already the SQL query that does the initial version (based on
>>>> xwikircs for now)
>>>>
>>>>   create table l10n_stats_author as select year(xwr_date)*100+
>>>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>>>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs group by
>>>> 1,2;
>>>>   delete from l10n_stats_author where l10n_month=201301 or
>>>> l10n_month=201302;
>>>>   insert into l10n_stats_author (select year(xwr_date)*100+
>>>> month(xwr_date) as l10n_month, XWR_AUTHOR as l10n_author, count(distinct
>>>> XWR_DOCID, XWR_VERSION1, XWR_VERSION2) as l10n_count from xwikircs where
>>>> xwr_date >= '2012-12-01' group by 1,2);
>>>>   select * from l10n_stats_author where l10n_month=201301 or
>>>> l10n_month=201302;
>>>>
>>>>
>>>> Putting this in a scheduler job should not be complicated
>>>>
>>>> Ludovic
>>>>
>>>>
>>>>
>>>> 2013/1/24 Sergiu Dumitriu <[hidden email]>
>>>>
>>>>> On 01/24/2013 12:03 AM, Ludovic Dubost wrote:
>>>>> > Working on l10n change to custom mapping showed up an issue with the
>>>>> l10n
>>>>> > history.
>>>>> > We have a huge history table with 2,3M lines (as well as an activity
>>>>> > stream) and we use the history table for extracting contributor
>>>>> statistics.
>>>>> >
>>>>> > However looking at it more closely shows that the source of most
>>>>> data in
>>>>> > the history table is useless.
>>>>> > See here a grouping by dates showing that localisation of already
>>>>> close to
>>>>> > 2M lines in 6 month in 2012.
>>>>> >
>>>>> > |                             201201 |   318346 |
>>>>> > |                             201202 |   311403 |
>>>>> > |                             201203 |   728703 |
>>>>> > |                             201204 |   271657 |
>>>>> > |                             201205 |   296384 |
>>>>> > |                             201206 |   120463 |
>>>>> >
>>>>> > Here is an example of history for an entry
>>>>> >
>>>>> > mysql> select
>>>>> >
>>>>> xwr_author,xwr_version1,xwr_version2,xwr_date,xwr_comment,cast(xwr_isdiff
>>>>> > as unsigned), length(xwr_patch) from xwikircs where
>>>>> xwr_docid=596494851  ;
>>>>> >
>>>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>>>> > | xwr_author            | xwr_version1 | xwr_version2 | xwr_date
>>>>> >  | xwr_comment | cast(xwr_isdiff as unsigned) | length(xwr_patch) |
>>>>> >
>>>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>>>> > | XWiki.XWikiTranslator |            1 |            1 | 2010-02-23
>>>>> 11:38:27
>>>>> > |             |                            1 |               163 |
>>>>> > | XWiki.XWikiTranslator |            2 |            1 | 2010-02-23
>>>>> 11:40:26
>>>>> > |             |                            1 |               330 |
>>>>> > | XWiki.rbuj            |            3 |            1 | 2010-03-04
>>>>> 00:24:24
>>>>> > |             |                            1 |               189 |
>>>>> > | XWiki.rbuj            |            4 |            1 | 2010-03-04
>>>>> 01:02:52
>>>>> > |             |                            1 |               223 |
>>>>> > | XWiki.rbuj            |            5 |            1 | 2010-07-30
>>>>> 01:12:58
>>>>> > |             |                            0 |              5026 |
>>>>> > | XWiki.XWikiTranslator |            6 |            1 | 2012-01-23
>>>>> 11:40:25
>>>>> > |             |                            1 |               115 |
>>>>> > | XWiki.XWikiTranslator |            7 |            1 | 2012-01-23
>>>>> 11:58:25
>>>>> > |             |                            1 |               115 |
>>>>> > | XWiki.XWikiTranslator |            8 |            1 | 2012-01-23
>>>>> 13:31:52
>>>>> > |             |                            1 |               234 |
>>>>> > | XWiki.XWikiTranslator |            8 |            2 | 2012-01-23
>>>>> 15:36:02
>>>>> > | Prepared    |                            1 |               115 |
>>>>> > | XWiki.XWikiTranslator |            8 |            3 | 2012-01-24
>>>>> 02:01:12
>>>>> > | Prepared    |                            0 |              5670 |
>>>>> > | XWiki.XWikiTranslator |            8 |            4 | 2012-01-25
>>>>> 02:04:00
>>>>> > | Prepared    |                            1 |               115 |
>>>>> > | XWiki.XWikiTranslator |            8 |            5 | 2012-01-26
>>>>> 02:02:29
>>>>> > | Prepared    |                            1 |               115 |
>>>>> > | XWiki.XWikiTranslator |            8 |            6 | 2012-01-27
>>>>> 02:01:18
>>>>> > | Prepared    |                            1 |               115 |
>>>>> > | XWiki.XWikiTranslator |            8 |            7 | 2012-02-04
>>>>> 02:01:08
>>>>> > | Prepared    |                            1 |               115 |
>>>>> > | XWiki.XWikiTranslator |            8 |            8 | 2012-02-07
>>>>> 02:02:03
>>>>> > | Prepared    |                            0 |              5670 |
>>>>> > | XWiki.XWikiTranslator |            8 |            9 | 2012-02-08
>>>>> 02:01:20
>>>>> > | Prepared    |                            1 |               115 |
>>>>> > | XWiki.XWikiTranslator |            8 |           10 | 2012-02-14
>>>>> 02:01:07
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           11 | 2012-02-28
>>>>> 02:01:32
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           12 | 2012-03-03
>>>>> 02:01:20
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           13 | 2012-03-14
>>>>> 02:01:50
>>>>> > | Prepared    |                            0 |              5671 |
>>>>> > | XWiki.XWikiTranslator |            8 |           14 | 2012-03-14
>>>>> 18:23:06
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           15 | 2012-03-14
>>>>> 18:32:38
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           16 | 2012-03-14
>>>>> 18:37:17
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           17 | 2012-03-24
>>>>> 02:01:34
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           18 | 2012-03-26
>>>>> 16:40:35
>>>>> > | Prepared    |                            0 |              5671 |
>>>>> > | XWiki.XWikiTranslator |            8 |           19 | 2012-03-28
>>>>> 02:01:23
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           20 | 2012-03-29
>>>>> 02:01:45
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           21 | 2012-04-11
>>>>> 02:01:36
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           22 | 2012-04-19
>>>>> 15:20:10
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           24 | 2012-04-29
>>>>> 02:01:45
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           25 | 2012-05-04
>>>>> 18:55:30
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           26 | 2012-05-04
>>>>> 19:02:09
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           27 | 2012-05-10
>>>>> 02:04:36
>>>>> > | Prepared    |                            1 |               116 |
>>>>> > | XWiki.XWikiTranslator |            8 |           28 | 2012-05-30
>>>>> 02:02:08
>>>>> > | Prepared    |                            0 |              5671 |
>>>>> > | XWiki.XWikiTranslator |            8 |           29 | 2012-06-07
>>>>> 10:44:28
>>>>> > | Prepared    |                            0 |              5782 |
>>>>> >
>>>>> +-----------------------+--------------+--------------+---------------------+-------------+------------------------------+-------------------+
>>>>> >
>>>>> >
>>>>> http://l10n.xwiki.org/xwiki/bin/view/XE/XEXWikiCoreResources_2107067965_core-menu-watchlist-add-page_nl?viewer=history&showminor=true
>>>>> >
>>>>> > Most entries are "no changes" and have been cause by the L10NUpdater
>>>>> which
>>>>> > wrongefully saved the document with no changes. I believe this must
>>>>> have
>>>>> > been fixed (by Thomas M.?)  mid 2012.
>>>>> >
>>>>> > Now the 2M lines impact performance significantly and loads the DB
>>>>> for
>>>>> > nothing (and in the activity stream as well).
>>>>> >
>>>>> > I suggest we clean up the history and activity stream. We have 2
>>>>> > possibilities:
>>>>> >
>>>>> > For xwikircs:
>>>>> >
>>>>> > 1/ Clean up only the bad data from XWikiTranslator when there are no
>>>>> > changes:
>>>>> >
>>>>> > This is complicated as you need to verify if the change is actually a
>>>>> > change and you cannot do that just with sql queries. It could be
>>>>> very long
>>>>> >
>>>>> > 2/ Clean up old data from pre-201206 from XWikiTranslator
>>>>> >
>>>>> > Simpler if it is safe to delete by date in the DB. After discussion
>>>>> with
>>>>> > sergui this is a bit complicated because you need to make sure you
>>>>> don't
>>>>> > delete the latest full version before the versions you keep. So you
>>>>> would
>>>>> > have to do it by API which will take ages.
>>>>> >
>>>>> > 3/ Clean up up old data from pre-201206 from all users
>>>>> >
>>>>> > This is simpler as you can safely delete from the database
>>>>> everything older
>>>>> > than a certain versions. Cleans-up even more but would loose
>>>>> contributor
>>>>> > statistics unless we store 2012 contributor counts in an alternate
>>>>> table
>>>>> > which would then be regularly updated
>>>>>
>>>>> +1 for 3/, without the extra stats table yet. See below.
>>>>>
>>>>> Technically, we could also use /2, since we'll still have history
>>>>> summary (who changed when), but some actual versions won't be
>>>>> retrievable. Is the actual revision important? IMHO, only if we want to
>>>>> investigate some foul play.
>>>>>
>>>>> > In any case we should probably create this intermediary table for
>>>>> > statistics as it would be much faster anyway.
>>>>> >
>>>>> > For activitystream:
>>>>> >
>>>>> > 1/ Clean up old data from XWikiTranslator 201206 or earlier
>>>>> >
>>>>> > 2/ Clean up old date from everybody 201206 or earlier
>>>>>
>>>>> +1 for 1/
>>>>>
>>>>> > What value do we see in the l10n history and actvity stream and which
>>>>> > solutions do the other commiters suggest ?
>>>>> >
>>>>> > I would say it's interesting for contributor statistics (counting
>>>>> number of
>>>>> > contribution by translators) but beyond that we can delete the data.
>>>>> > So we would fix that by storing monthly statistics in a table and
>>>>> updating
>>>>> > the latest 2 month through a scheduler job. This means that we can
>>>>> also
>>>>> > delete history over 2 month.
>>>>>
>>>>> We can get contributor activity from the activity stream, independently
>>>>> from the RCS table. Since the activity stream doesn't have
>>>>> inter-version
>>>>> dependencies like the RCS does, we can freely discard irrelevant rows
>>>>> from it, and use the remaining valid data. The same information is
>>>>> present in the activity stream and in the RCS table: who did what,
>>>>> when,
>>>>> and where.
>>>>>
>>>>> I mean, this is a good fast solution for the moment. We could still
>>>>> make
>>>>> a custom statistics data structure for translation contributors, but
>>>>> that would take longer, and thus it delays the migration of the l10n
>>>>> wiki to a newer version, and delays the performance improvement that
>>>>> we'd gain by dropping data.
>>>>>
>>>>> --
>>>>> Sergiu Dumitriu
>>>>> http://purl.org/net/sergiu
>>>>> _______________________________________________
>>>>> devs mailing list
>>>>> [hidden email]
>>>>> http://lists.xwiki.org/mailman/listinfo/devs
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Ludovic Dubost
>>>> Founder and CEO
>>>> Blog: http://blog.ludovic.org/
>>>> XWiki: http://www.xwiki.com
>>>> Skype: ldubost GTalk: ldubost
>>>>
>>>
>>>
>>>
>>> --
>>> Ludovic Dubost
>>> Founder and CEO
>>> Blog: http://blog.ludovic.org/
>>> XWiki: http://www.xwiki.com
>>> Skype: ldubost GTalk: ldubost
>>>
>>
>>
>>
>> --
>> Ludovic Dubost
>> Founder and CEO
>> Blog: http://blog.ludovic.org/
>> XWiki: http://www.xwiki.com
>> Skype: ldubost GTalk: ldubost
>>
>
>
>
> --
> Ludovic Dubost
> Founder and CEO
> Blog: http://blog.ludovic.org/
> XWiki: http://www.xwiki.com
> Skype: ldubost GTalk: ldubost
>



--
Ludovic Dubost
Founder and CEO
Blog: http://blog.ludovic.org/
XWiki: http://www.xwiki.com
Skype: ldubost GTalk: ldubost
_______________________________________________
devs mailing list
[hidden email]
http://lists.xwiki.org/mailman/listinfo/devs