Double empty lines after upgrade. Plan to fix this?


estan

I've got a test CLI script working.  It needs a bit of configuration at the top.

// CHANGEABLE VARIABLES

// If TRUE then no changes will be made but you will see the number of potential changes
$testMode = FALSE;

// Starting post id (pid)
$startPost = 0;

// Ending post id - This pid IS processed. Use 9999999 for no end limit. 
$endPost = 9999999;	

// The custom class to use - <p class="customclass">. It can optionally be left in place for manual SQL cleanup later.
$customClass = 'doublepara-processed';
$leaveCustomClass = TRUE;

I ran it against my test migrated board (1.2 million posts) and it took a couple of minutes - I'm checking posts at the moment and it looks good so far.  Interestingly it only started needing to process posts at around the 800k post mark. I'm guessing that's when I upgraded to 3.4. The posts before that all seem to use pure <br> for paragraphs so they aren't touched. In the end it needed to alter 62k posts out of 1.2 million.

If you would like to test the script then send me a personal message and I'll give it to you. Strictly on the condition that you are responsible for backing up your board or using a test board. No moaning if it goes wrong!

For the technically minded: the regexes I ended up using catch a lot of edge cases, so it's not a simple replace-all. If anyone can see a flaw, please shout.

// First change double empty paragraphs into single empty paragraphs and protect them
$newPost = preg_replace('#(?<=</p>)(\s*)<p>&nbsp;</p>\s*<p>&nbsp;</p>#', '$1<p class="' . $customClass . '">&nbsp;</p>', $oldPost);

// Remove remaining single empty paragraphs
$newPost = preg_replace('#(?<=</p>)(\s*)<p>&nbsp;</p>#', '$1', $newPost);

// Restore protected paragraphs (optional)
if (!$leaveCustomClass)
{
	$newPost = str_replace('<p class="' . $customClass . '">&nbsp;</p>', '<p>&nbsp;</p>', $newPost);
}
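
For anyone who wants to try the three regex steps outside PHP, here is a minimal Python sketch of the same idea. It is not the actual script: the function name `clean_post` and the constants are made up for illustration, and the protected paragraph is written with a complete `&nbsp;` entity.

```python
import re

# Hypothetical names for illustration; the class matches the config shown above.
CUSTOM_CLASS = "doublepara-processed"
EMPTY = "<p>&nbsp;</p>"
PROTECTED = '<p class="' + CUSTOM_CLASS + '">&nbsp;</p>'


def clean_post(old_post, leave_custom_class=True):
    """Collapse double empty paragraphs to one and drop single ones."""
    # Step 1: a pair of empty paragraphs right after </p> becomes one
    # protected paragraph, so step 2 will not remove it.
    new_post = re.sub(
        r"(?<=</p>)(\s*)<p>&nbsp;</p>\s*<p>&nbsp;</p>",
        lambda m: m.group(1) + PROTECTED,
        old_post,
    )
    # Step 2: any remaining single empty paragraph after </p> is removed,
    # keeping the whitespace that preceded it.
    new_post = re.sub(r"(?<=</p>)(\s*)<p>&nbsp;</p>", r"\1", new_post)
    # Step 3 (optional): turn protected paragraphs back into plain ones.
    if not leave_custom_class:
        new_post = new_post.replace(PROTECTED, EMPTY)
    return new_post
```

Calling `clean_post(text, leave_custom_class=False)` on a copy of a post body is a cheap way to eyeball the result before running anything against a database.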

 


If you would like to test the script then send me a personal message and I'll give it to you. Strictly on the condition that you are responsible for backing up your board or using a test board. No moaning if it goes wrong!

@AutoItScript I very very much appreciate the help!

I will send you a personal message in a minute. I definitely agree to be responsible. I will do a database backup on the live site (once it is migrated to 4.0.2) and I also plan to first run the script on my test board. So, no worries :) 


I did a test upgrade from 3.4.7 to 4.0.1. There were a few minor issues here and there, but overall everything looks good.

My biggest concern after the upgrade is the double line spacing in articles, posts, etc. Many posts and articles where an extra empty line or two was intentionally added in 3.4.7 to separate paragraphs now show far too much space after the upgrade to 4.0.1. As I understand it, this is because the two editors handle paragraphs, new lines, etc. differently.

This is a big concern for me. Obviously, I cannot go through many thousands of posts and articles one by one and fix them manually, and they definitely do not look good now. This is perhaps the only thing preventing me from upgrading now. Does anyone have any ideas?

@Charles or @bfarber or anyone else from IPS, any thoughts or suggestions here? Is there a smart, automatic way to deal with this?
What if the upgrade offered an optional checkbox for automatically removing extra empty lines, or something like that? Or is there any way to deal with this after the upgrade?

 

I think that, with bugs still being found and issues still to be ironed out, it's best to wait until it is officially and publicly released before upgrading. If you upgrade before that happens, you're going to be a tester and will have problems, and one should understand that when upgrading.


  • 2 months later...

WARNING: there may be some cases where people had intentionally hit enter many times, but I consider the loss of those intentional spaces on a handful of posts a small price to pay for fixing the majority. Oh, and always take a backup of your database before doing anything like this:

The three queries in the attached text file, run in phpMyAdmin, fixed it for >40k of my posts that were affected. Sorry I had to attach it as a text file, but a big security error popped up when I tried typing MySQL queries into the editor here (not too surprising really).

remove blank paragraphs.txt


  • 3 months later...
On 22.4.2015, AutoItScript said:

If you would like to test the script then send me a personal message and I'll give it to you. Strictly on the condition that you are responsible for backing up your board or using a test board. No moaning if it goes wrong!

I recently upgraded from vB 4 to IPB 4 and my old posts are messed up. Would you be kind enough to give me the script, please?


  • 1 month later...

This happened to my board when I did an upgrade test on my local machine.

I used the command line to make the changes.

1.  BACKUP THE DATABASE!

2.  Get into mysql by typing mysql into the prompt.  You might have to enter your mysql root password here.

3.  Select the database by typing USE [database name], without the brackets.

4.  Enter the queries below (from @Peter.B above) one at a time, REPLACING "ipb_" with whatever you used as the prefix for your database.

update `ipb_forums_posts` set post = replace(post,"<p> </p>","");
update `ipb_forums_posts` set post = replace(post,"<p>&nbsp;</p>","");
update `ipb_forums_posts` set post = replace(post,"<p></p>","");

Additional note:  I found that adding extra spaces in the first one, between <p> and </p>, found additional hits.  I believe I went out to six spaces, running the query again for each additional space.
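
The replace-and-repeat approach above can be sketched in Python against an in-memory SQLite database, which is handy for trying it out before touching a real board. Assumptions: SQLite stands in for MySQL (both offer a `replace()` string function), the helper name `strip_empty_paragraphs` is made up, and the table is expected to have a text column called `post`.

```python
import sqlite3


def strip_empty_paragraphs(conn, table="ipb_forums_posts", max_spaces=6):
    """Run one replace() UPDATE per empty-paragraph variant.

    `table` is a hypothetical, trusted name (it is interpolated into the SQL).
    Returns the total number of row updates across all variants.
    """
    targets = ["<p>&nbsp;</p>", "<p></p>"]
    # One variant per run of plain spaces, mirroring the "add a space and
    # run it again" note above.
    targets += ["<p>" + " " * n + "</p>" for n in range(1, max_spaces + 1)]

    updates = 0
    for target in targets:
        cur = conn.execute(
            f"UPDATE {table} SET post = replace(post, ?, '') "
            "WHERE instr(post, ?) > 0",
            (target, target),
        )
        updates += cur.rowcount
    conn.commit()
    return updates
```

The `WHERE instr(...) > 0` clause only limits the update to rows that contain the target, so the returned count reflects rows that actually changed rather than every row in the table.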


I've been doing some of these post format corrections in the database since yesterday after I updated the board to 4.1 as a lot of posts/topics are a mess. I have noticed some posts have this problem in their content...

</p><br><p> </p><br><p>

I tried running the following query on the post column to clear this up - without any success...

update `ipb_forums_posts` set post = replace(post,"<br><p> </p><br>","");

...it is always returning zero rows affected. The string expression to find is exactly as it is found in the database. Any ideas what I am missing here?

Ted.
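
One possible explanation, and it is only a guess without seeing the data: the character between `<p>` and `</p>` may not be an ordinary space. A non-breaking space (U+00A0) looks identical in most tools but will not match a space typed on the keyboard. Below is a small Python sketch for listing the code points in a fragment copied out of the database; the helper names are hypothetical.

```python
import unicodedata


def between_tags(html, open_tag="<p>", close_tag="</p>"):
    """Return the text between the first open_tag and the next close_tag."""
    start = html.index(open_tag) + len(open_tag)
    end = html.index(close_tag, start)
    return html[start:end]


def describe_chars(fragment):
    """List each character as 'U+XXXX NAME' so look-alikes become visible."""
    return [
        "U+%04X %s" % (ord(ch), unicodedata.name(ch, "UNKNOWN"))
        for ch in fragment
    ]
```

If `describe_chars(between_tags(stored_post))` reports something other than `U+0020 SPACE`, the typed query string and the stored string are different, which would be consistent with zero rows affected.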


8 hours ago, Teddy Rogers said:

I've been doing some of these post format corrections in the database since yesterday after I updated the board to 4.1 as a lot of posts/topics are a mess. I have noticed some posts have this problem in their content...


</p><br><p> </p><br><p>

I tried running the following query on the post column to clear this up - without any success...


update `ipb_forums_posts` set post = replace(post,"<br><p> </p><br>","");

...it is always returning zero rows affected. The string expression to find is exactly as it is found in the database. Any ideas what I am missing here?

Ted.

I have the same code in my converted board.  I've been trying to replace it for about half an hour with no luck.  When I was fixing an issue with the Twitter embeds, I noticed that I had to use the HTML entity "&gt;" for the ">" I was replacing.  I tried doing that with this query, but I got the same results as you.


19 hours ago, Bluto said:

If the queries for removing the <p> </p> would have been completely successful, there would be no <br><p> </p><br> to remove.

Yes, I thought exactly the same after I ran that query. I'm not entirely sure why these are being missed. It seems an odd one, but then SQL isn't my strong suit, so it's likely to be operator error...

Ted.


4 hours ago, Teddy Rogers said:

Yes, I thought exactly the same after I ran that query. I'm not entirely sure why these are being missed. It seems an odd one, but then SQL isn't my strong suit, so it's likely to be operator error...

Ted.

I searched everywhere on Google trying to find out if the query was wrong, and just about every site showed the same syntax.  The query below did find the correct number of rows, but again, it didn't make any changes.

update [IPB Prefix]forums_posts set post = replace(post,'<br><p> </p><br>','') where post like '%<br><p> </p><br>%';

I'm wondering if the original queries have something to do with it not making changes with the new queries.


IDK.  I'm going to be testing my upgrade again on my local system (I haven't upgraded to v4 yet), and I'll run the updated query first to see if the br-p-p-br gets removed.  There are also several pieces of software out there that let you modify the database remotely.  I believe Navicat is one of them.  I might end up using that to search and replace if the query doesn't work.  http://www.navicat.com/products


  • 3 weeks later...
  • 2 weeks later...
  • 2 years later...

Okay. This is an old topic, but I didn't find a fully working solution here. This SQL query is almost correct (it was mentioned earlier: #1, #2):

UPDATE `ibf_forums_posts` SET post = REPLACE(post,'<p> </p>','');

But for this SQL query to work, you need to copy "<p> </p>" from the database table instead of typing it by hand. I recorded a short video as an example:

In any case, don't forget to make a backup.
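
Here is a sketch of why copying the string from the table can matter, under the assumption (not confirmed anywhere in this topic) that the stored character is a non-breaking space, U+00A0, and not a normal space. It uses Python with an in-memory SQLite database in place of MySQL; the helper `remove_fragment` and the default table name are only for illustration.

```python
import sqlite3

NBSP = "\u00a0"  # non-breaking space; renders like " " but is a different character


def remove_fragment(conn, fragment, table="ibf_forums_posts"):
    """Delete every occurrence of `fragment` from the post column.

    `table` is a hypothetical, trusted name (it is interpolated into the SQL).
    Returns the number of rows that contained the fragment.
    """
    cur = conn.execute(
        f"UPDATE {table} SET post = replace(post, ?, '') "
        "WHERE instr(post, ?) > 0",
        (fragment, fragment),
    )
    conn.commit()
    return cur.rowcount
```

With a row that contains `"<p>" + NBSP + "</p>"`, calling `remove_fragment(conn, "<p> </p>")` with a typed space touches no rows, while `remove_fragment(conn, "<p>" + NBSP + "</p>")` removes the paragraph. Copying the string straight from the table carries the real character along, whatever it turns out to be.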


Archived

This topic is now archived and is closed to further replies.
