Full Version: Persistent unknown problem with suspend/resume [solved]
I'm unable to suspend with 4.4.0-xanmod2 / bfq-v7r11... :(

Edit: waiting for news at https://groups.google.com/forum/#!forum/bfq-iosched
Edit2: making some suspending tests under a plain Ubuntu 14.04.3 installation.
Edit3: 4.4.0-xanmod2 does not suspend for me on a plain installation either (btw the former 4.3.3-xanmod8 suspends correctly). HougeLangley has reported suspending successfully.
Edit4: problem when switching off the notebook. I tried to reproduce the failure, without success.
Edit5: again a problem when switching off the notebook, but it seems to be a random issue.

Edit6: solved with the patched Ubuntu kernel 4.4.0-7-surface3 by our member Tigerite -- anyway, I can't resume after suspend with the latest 4.4.3-xanmod7. February 26, 2016.

Edit7: the problem shows up again with 4.4.3-surface~tigerite and it seems to be related to Intel's DRM; I'm testing this persistent issue again. February 29, 2016.
Edit8: after testing several kernels to help our member Mijasnik, I think the problem is again related to the BFS-BFS/VRQ patch. March 3, 2016.
Edit9: Tigerite's development kernel for Microsoft Surface devices suspends and resumes well, but the mouse is very slow at certain moments and application menus take a long time to open. 4.4.4-xanmod8 is unable to suspend (switch off). March 4, 2016.

Note: topic definitely marked as solved after the release of 4.4.4-xanmod9, with the Linux mainline code base, the CFS CPU scheduler tuned for better responsiveness, memory dirty writeback from the Zen Interactive concept and KSM memory data deduplication. March 6, 2016!
I wonder if it's a problem of mine only, but I don't care about this little issue because I don't use suspend. 4.4.0-xanmod2 500hz is really charming: I love its amazingly low battery consumption, with an impressive balance between speed, performance, strength and stability. A really good kernel IMHO. Xan, I really appreciate it. :)
I think there is some kind of regression from v7r8 to v7r11.  :-X

I wonder if '__GFP_WAIT got renamed to __GFP_DIRECT_RECLAIM' could be the regression here: this change is what allows the 4.3 v7r8 patches to work with 4.4-rcX and also with 4.4.0, but I can't find it in v7r11, or at least I can't find a similar function that handles it. https://groups.google.com/forum/#!topic/...e-RA9y-uBY

(1) http://algo.ing.unimo.it/people/paolo/di....4.0.patch
(2) http://algo.ing.unimo.it/people/paolo/di....4.0.patch
(3) http://algo.ing.unimo.it/people/paolo/di...-for.patch
"Hi community,
I updated BFQ v7r8 to work with Linux 4.4-rc5.
Patch 4 can be merged if needed in a later state. Needed since rc5. rc1 - rc4 will work with the patches 1 - 3.
Changes since v4.3:
__GFP_WAIT got renamed to __GFP_DIRECT_RECLAIM

Edit: something wrong with old CPUs, perhaps?
The suspend issues are more likely to be related to BFS-VRQ: http://cchalpha.blogspot.co.uk/2016/01/f...l-v44.html (a fix is coming).
Currently 4.4.0-xanmod2 works fine for me. But I am analyzing...
4.4.0-xanmod3 still doesn't suspend for me. 4.4.0-mainline suspends O.K., which means that something is wrong with the patches: http://cchalpha.blogspot.co.uk/2016/01/v...eased.html  https://groups.google.com/forum/#!forum/bfq-iosched
Edit: topic title changed to 'unknown problem' issue.
dmesg > dmesg_tropic_4.4.0-xanmod3.log
and attach to your topic: http://xanmod.org/forum/index.php/topic,29.0.html
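
The log name above appears to follow the pattern dmesg_<hostname>_<kernel release>.log. A minimal shell sketch that builds such a name on any machine, assuming that pattern (the helper name dmesg_log_name is made up for illustration):

```shell
# Hypothetical helper: build a log file name from the host name and the
# running kernel release, e.g. dmesg_tropic_4.4.0-xanmod3.log
dmesg_log_name() {
    printf 'dmesg_%s_%s.log\n' "$(uname -n)" "$(uname -r)"
}

# Usage (left commented out so the sketch has no side effects):
# dmesg > "$(dmesg_log_name)"
```

Naming the file after the running kernel makes it easier to tell logs apart when several kernels are tested on the same machine.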

Uploaded. :)
Edit: 4.4.1-mainline suspends fine also.
Edit2: attachment deleted as no feedback was provided. February 17, 2016.
"The causes are complicated, the most major one is I have removed some code path to reschedule a cpu/rq after putting a task into the global run queue.  The second one maybe a circle deadlock in mainline, I catch the dmesg twice during my 200+ suspend/resume tests, and reduce the task cached time-out seems to helping with the resume success rate.
In this release, beside adding back the code to pump the scheduler, the NORMAL policy task caching time-out has been changed to 3ms, all rt policy task caching time-out to 0ms(in fact that rt policy tasks never be impacted by caching time-out, unless they are changed to NORMAL policy after caching). Issue 1 and 2 are fixed, issue 3 tested with 10 suspend/resume in console and 10 suspend/resume in X, so the failure rate of suspend/resume should be <5%.

4.4.0-mainline suspends correctly, so there is no such 'circle deadlock' in mainline. The failure rate is just unacceptable. http://cchalpha.blogspot.co.uk/2016/01/v...eased.html
Edit: topic title changed, pointing towards VRQ2. :(
4.4.1-xanmod4 still doesn't suspend for me, as expected, because no newer patches have been merged -- 4.4.1-mainline suspends and resumes fine. It seems that this failure has very little impact on vrq2/bfq-v7r11 users, since so far only I have reported the problem. :-X

Edit: waiting for the future Con Kolivas BFS v468 4.4-ck, with gold rush! :P
Some bottlenecks observed with high speed 100 Mb/s public Wi-Fi network with 4.4.1-xanmod4. With 4.3.5-xanmod10 the experience was good. VRQ2 related issue?  ???

Searched for some info, but it pointed to hardware issues and non-sequential I/O.
Solved for me after adding the following lines to the sysctl.conf file:
~$ sudo gedit /etc/sysctl.conf
~$ sudo sysctl -p

Edit: just look for duplicated parameters and comment out the old ones with #.
Edit2: it's better to write those options at the end of the file so they are easy to locate.
Edit3: additional settings to reduce bottlenecks at high speed, if needed:

Edit4: if weird bottlenecks are still observed at high speed, then:
# don't use this setting on very low speed networks
# use carefully and only if weird bottlenecks are observed
# add # to the old value net.ipv4.tcp_timestamps=1

Just have a look at my sysctl.conf file, it works like a charm for me! 8)
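
The advice above about looking for duplicated parameters can be sketched in shell. This is only an illustration: the function name find_duplicate_sysctl_keys is made up, and it assumes the usual key = value layout with # or ; starting a comment line.

```shell
# Hypothetical helper: list keys that are set more than once in a sysctl file.
# Defaults to /etc/sysctl.conf when no file is given.
find_duplicate_sysctl_keys() {
    awk -F= '
        /^[ \t]*[#;]/ { next }               # skip commented-out lines
        !/=/          { next }               # skip blank lines and anything without "="
        {
            key = $1
            gsub(/[ \t]/, "", key)           # drop whitespace around the key
            if (++seen[key] == 2) print key  # report each duplicated key once
        }
    ' "${1:-/etc/sysctl.conf}"
}

# Usage:
# find_duplicate_sysctl_keys /etc/sysctl.conf
```

Any key it prints is set at least twice among the active lines, so the older entry is a candidate for commenting out with #.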
I have been testing 4.5.0-mainline-rc3 and I am surprised by its performance. I think that something is wrong here but I don't know why; perhaps something is wrong with VRQ2, or there are unexpected unsolved mainline bugs. :-X 4.4.1-xanmod4 is very good in speed, but 4.3.5-xanmod10 is more stable than 4.4.1-xanmod4, at least for me. BTW 4.3.5-xanmod10 is merged with the 'old' Kolivas and BFQ patches. Probably my computer is too old also. :-X http://kernel.ubuntu.com/~kernel-ppa/mai...-rc3-wily/
Just one example to add some 'value' to my former post, transfer rates from devices:
kernel            CP  DR  DBR  SDcard  USB   camera  phone  HDD    SSD    mean
4.4.1-xanmod4     65  65  65   7.20    7.60  12.00   11.00  9.90   12.90  10.10
4.5-rc3-mainline  65  65  65   7.90    8.10  13.00   11.90  10.50  13.90  10.88
4.4.1-xanmod4     75  75  75   7.80    7.90  12.10   11.80  10.00  13.60  10.53

* CP = vm.vfs_cache_pressure, DR = vm.dirty_ratio, DBR = vm.dirty_background_ratio; the other columns are the devices tested and the mean over them.
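
The mean column can be re-checked with a few lines of shell. This is just a sketch for the arithmetic; the helper name mean is made up, and the inputs are the six per-device values of each row.

```shell
# Hypothetical helper: average the numbers given as arguments,
# printed with two decimals.
mean() {
    printf '%s\n' "$@" | awk '{ sum += $1 } END { printf "%.2f\n", sum / NR }'
}

mean 7.20 7.60 12.00 11.00 9.90 12.90   # 4.4.1-xanmod4, CP/DR/DBR = 65
mean 7.90 8.10 13.00 11.90 10.50 13.90  # 4.5-rc3-mainline, CP/DR/DBR = 65
mean 7.80 7.90 12.10 11.80 10.00 13.60  # 4.4.1-xanmod4, CP/DR/DBR = 75
```

The three calls print 10.10, 10.88 and 10.53, matching the mean column.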
Tested NickTh's 4.4.1+BFQv7r11 kernel and it suspended successfully 10 out of 10 times, so I can conclude that the culprit of my long-standing suspend problem on the 4.4 series is the VRQ2 patch -- currently in 4.4.1-xanmod4, the latest release. Topic title changed to inform about this absolutely annoying and unsolved issue. Such a pity! :(
~$ uname -a
Linux tropic 4.4.0-11-bfq #201602071458-Ubuntu SMP PREEMPT Sun Feb 7 13:04:56 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux
~$ dmesg | grep -i -e 'bfs' -e 'bfq' -e 'toi' -e tux -e 'uksm' -e 'gcc'
[    0.000000] Linux version 4.4.0-11-bfq (buildd@lgw01-18) (gcc version 4.8.4 (Ubuntu 4.8.4-2ubuntu1~14.04) ) #201602071458-Ubuntu SMP PREEMPT Sun Feb 7 13:04:56 UTC 2016
[    1.748140] io scheduler bfq registered (default)
[    1.748146] BFQ I/O-scheduler: v7r11
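
The dmesg lines above show BFQ being registered as the default at boot. To confirm which scheduler a given disk is using at runtime, the sysfs file /sys/block/<device>/queue/scheduler lists the available schedulers with the active one in square brackets. A small sketch for pulling out that name (the helper name active_scheduler, the device name sda and the sample line are assumptions for illustration):

```shell
# Hypothetical helper: read a sysfs scheduler line on stdin and print
# the entry shown in square brackets, i.e. the active scheduler.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Real usage (device name is an assumption, adjust for your disk):
# active_scheduler < /sys/block/sda/queue/scheduler

# Demonstration on a sample line:
echo 'noop deadline cfq [bfq]' | active_scheduler
```

On the sample line this prints bfq.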

Edit: thanks again to Tigerite for pointing to the right cause of this main topic.

Topic title changed, my notebook is still unable to suspend with vrq3.  ???
Edit: topic title changed for 4.4.2-xanmod6/vrq3.
Advice: please notice that this is a VRQ3-related problem, not strictly a XanMod one.
If you like, you can try the kernel I've created a PPA for, at ppa:tigerite/kernel .. it's really for Microsoft Surface devices, which need particular patches (and a recent DRM tree for Surface 3), but it has many of XanMod's features patched in, namely

Set Swappiness to 10
Set Zswap compressor to LZ4
Set cache pressure to 75
Set uksmd nice priority to 15
BFQ I/O-scheduler v7r11 (with fixed Kconfig BFQ dependency on BLK_CGROUP)
Built with GCC-5.3.1

But.. no BFS. I have instead patched in Zen interactivity from liquorix.net (actually everything from the 4.4.1-1 patch that wasn't already present in Ubuntu's 4.4.0-4.19), but with Ubuntu's config except for extras from liquorix. It was quite a job to get everything in! You could try it and see if suspend is an issue; if not, maybe XanMod could consider going with Zen interactivity instead of BFS?
