Default dependency to svc:/milestone/sysconfig incorrect on Solaris 11
cruschke opened this issue
The documentation says that these are the default dependencies:
- svc:/milestone/sysconfig
- svc:/system/filesystem/local
- svc:/milestone/name-services
- svc:/milestone/network
but svc:/milestone/sysconfig does not exist anymore on Solaris 11.
Solaris 10
mysolaris10 :~> svcs svc:/milestone/sysconfig svc:/system/filesystem/local svc:/milestone/name-services svc:/milestone/network
STATE STIME FMRI
online Oct_22 svc:/milestone/network:default
online Oct_22 svc:/system/filesystem/local:default
online Oct_22 svc:/milestone/sysconfig:default
online Oct_22 svc:/milestone/name-services:default
Solaris 11
mysolaris11 :~> svcs svc:/milestone/sysconfig svc:/system/filesystem/local svc:/milestone/name-services svc:/milestone/network
svcs: Pattern 'svc:/milestone/sysconfig' doesn't match any instances
STATE STIME FMRI
online Jun_06 svc:/milestone/network:default
online Jun_06 svc:/system/filesystem/local:default
online Jun_06 svc:/milestone/name-services:default
On Solaris 11 there is svc:/milestone/config:default instead, which appears to be the replacement for svc:/milestone/sysconfig.
Honestly, I think the default dependencies need to be reworked anyways. I
should be able to push a fix later today, but would welcome some thoughts
on a better way to organize default deps.
/cc @bixu
@cruschke I think I'm going to hack this fix together in the short term. I don't have a Solaris 11 machine to test, however. Could you let me know what node.platform, node.platform_family and node.platform_version are?
Looking at Ohai, my worry is that it will show up as solaris2, which is indistinguishable from Solaris 10. How do I make sure this doesn't break older versions?
Seems like we should try to get "platform_family" to report either "illumos" or "solaris", while "platform" should return "solaris10", "solaris11", "smartos", etc. But I may be misunderstanding the intended use of these automatic attributes.
This is from ohai on my Solaris 11:
"platform": "solaris2",
"platform_family": "solaris2",
"platform_version": "5.11",
In general, for platforms that you don't have available to test, you could look at Fauxhai to see what, for example, SmartOS would report.
OmniOS should also be considered for support.
I think using only_if (or not_if) with the "os_version": "5.11" node attribute should properly distinguish between Solaris 10, Solaris 11, and their forks (SmartOS/OmniOS). platform_version would not work on OmniOS, as they put their build number into that value.
@cruschke sorry for the delay on a fix, but I just want to ensure that I don't break Solaris 10 compatibility. Do you happen to have a Solaris 10 machine available to verify that platform_version is reported as 5.10?
Sure I do. This is Ohai output on Solaris 10:
"platform_version": "5.10",
"platform": "solaris2"
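Given the attribute values reported in this thread, one way to choose the default milestone could look like the sketch below. This is not the cookbook's actual code: `default_dependencies` is a hypothetical helper, a plain Hash stands in for Chef's `node` object, and it assumes that only Oracle Solaris 11 (platform "solaris2", platform_version "5.11") uses svc:/milestone/config.

```ruby
# Hypothetical helper: picks the default SMF dependencies from node attributes.
# A plain Hash stands in for Chef's `node` object so the sketch runs on its own.
def default_dependencies(node)
  # Assumption: only Oracle Solaris 11 replaced svc:/milestone/sysconfig
  # with svc:/milestone/config; everything else keeps the old milestone.
  solaris11 = node['platform'] == 'solaris2' && node['platform_version'] == '5.11'
  milestone = solaris11 ? 'svc:/milestone/config' : 'svc:/milestone/sysconfig'

  [
    milestone,
    'svc:/system/filesystem/local',
    'svc:/milestone/name-services',
    'svc:/milestone/network'
  ]
end

solaris10 = { 'platform' => 'solaris2', 'platform_version' => '5.10' }
solaris11 = { 'platform' => 'solaris2', 'platform_version' => '5.11' }

puts default_dependencies(solaris10).first # svc:/milestone/sysconfig
puts default_dependencies(solaris11).first # svc:/milestone/config
```

Matching on both attributes keeps Solaris 10 on the old milestone and leaves any platform that does not report "solaris2" untouched.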
@cruschke ok, finally pushed a new version to GitHub and to community.opscode.com. Can you test to ensure that it works out?
Thanks for looking at this issue; however, I believe it's not entirely fixed.
Here is my test scenario:
uname from my Vagrant box
vagrant@live-solaris-11-1:~$ uname -a
SunOS live-solaris-11-1.vagrantup.com 5.11 11.1 i86pc i386 i86pc
This is the debug output of Chef run ...
* smf[splunk] action install [2014-06-20T01:07:13+10:00] INFO: Processing smf[splunk] action install (myrecipe::_splunkforwarder_service_smf line 23)
STATE STIME FMRI
STATE STIME FMRI
[2014-06-20T01:07:14+10:00] DEBUG: Loaded checksum for SMF splunk:
[2014-06-20T01:07:14+10:00] DEBUG: SMF service already exists for /application/management/splunk? false
[2014-06-20T01:07:14+10:00] DEBUG: Creating manifest directory at /var/svc/manifest/application
[2014-06-20T01:07:14+10:00] DEBUG: Writing SMF manifest for splunk
[2014-06-20T01:07:14+10:00] DEBUG: importing SMF manifest /var/svc/manifest/application/splunk.xml
1
[2014-06-20T01:07:15+10:00] DEBUG: Saving checksum for SMF splunk: c2c2d8e7a3bc0d93c326adf72a47a00c
Here is the generated SMF manifest:
vagrant@live-solaris-11-1:~$ svccfg export splunk
<?xml version='1.0'?>
<!DOCTYPE service_bundle SYSTEM '/usr/share/lib/xml/dtd/service_bundle.dtd.1'>
<service_bundle type='manifest' name='export'>
<service name='application/management/splunk' type='service' version='0'>
<create_default_instance enabled='true'/>
<single_instance/>
<dependency name='milestone' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/milestone/sysconfig'/>
</dependency>
<dependency name='fs-local' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/system/filesystem/local'/>
</dependency>
<dependency name='name-services' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/milestone/name-services'/>
</dependency>
<dependency name='network' grouping='require_all' restart_on='none' type='service'>
<service_fmri value='svc:/milestone/network'/>
</dependency>
<method_context>
<method_credential privileges='basic,net_privaddr' user='vagrant'/>
</method_context>
<exec_method name='start' type='method' exec='/opt/splunkforwarder/bin/splunk start' timeout_seconds='60'/>
<exec_method name='stop' type='method' exec='/opt/splunkforwarder/bin/splunk stop' timeout_seconds='60'/>
<exec_method name='restart' type='method' exec='/opt/splunkforwarder/bin/splunk restart' timeout_seconds='60'/>
<property_group name='general' type='framework'>
<propval name='action_authorization' type='astring' value='solaris.smf.manage.splunk'/>
<propval name='value_authorization' type='astring' value='solaris.smf.value.splunk'/>
</property_group>
<stability value='Evolving'/>
<template>
<common_name>
<loctext xml:lang='C'>splunk</loctext>
</common_name>
</template>
</service>
</service_bundle>
As you can see, it still depends on the sysconfig milestone, although it should depend on config instead.
Oh... I think I know why this is happening, but I'll need to look a bit to
find the right solution.
If I'm correct, new files written will be correct. The problem will only be
updating existing files. You could even delete the checksum file for this
service in /var/chef/checksums to force it to rewrite the file.
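To illustrate why a stale manifest could survive a cookbook update, here is a minimal sketch of checksum-gated writing. It is an assumption about the general mechanism, not the cookbook's implementation: `write_manifest_if_changed` is a hypothetical helper, the checksum is assumed to be an MD5 over the resource's attributes, and the file names are made up for the demo.

```ruby
require 'digest/md5'
require 'tmpdir'

# Hypothetical helper, not the cookbook's code. It assumes the stored checksum
# is derived from the resource's attributes, so a matching checksum skips the
# write even when the XML rendered from those attributes would now differ.
# Returns true when the manifest was (re)written.
def write_manifest_if_changed(manifest_path, checksum_path, attributes, xml)
  new_sum = Digest::MD5.hexdigest(attributes.sort.to_s)
  old_sum = File.exist?(checksum_path) ? File.read(checksum_path).strip : nil
  return false if new_sum == old_sum

  File.write(manifest_path, xml)
  File.write(checksum_path, new_sum)
  true
end

Dir.mktmpdir do |dir|
  manifest = File.join(dir, 'splunk.xml')
  checksum = File.join(dir, 'splunk.checksum')
  attrs    = { 'name' => 'splunk', 'user' => 'vagrant' }
  old_xml  = "<service_fmri value='svc:/milestone/sysconfig'/>"
  new_xml  = "<service_fmri value='svc:/milestone/config'/>"

  puts write_manifest_if_changed(manifest, checksum, attrs, old_xml) # true: first write
  puts write_manifest_if_changed(manifest, checksum, attrs, new_xml) # false: stale file kept
  File.delete(checksum)                                              # drop the stored checksum
  puts write_manifest_if_changed(manifest, checksum, attrs, new_xml) # true: rewritten
  puts File.read(manifest).include?('milestone/config')              # true
end
```

Under these assumptions, a freshly built machine has no stored checksum and always receives a newly rendered manifest.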
My test scenario rebuilds the entire VM from scratch (using Test Kitchen), so there is guaranteed to be no leftover SMF manifest.
Recipe: bild_monitoring::_splunkforwarder_service_smf
* smf[splunk] action install [2014-06-20T16:51:35+10:00] INFO: Processing smf[splunk] action install (bild_monitoring::_splunkforwarder_service_smf line 23)
STATE STIME FMRI
STATE STIME FMRI
[2014-06-20T16:51:35+10:00] DEBUG: Loaded checksum for SMF splunk:
[2014-06-20T16:51:35+10:00] DEBUG: SMF service already exists for /application/management/splunk? false
[2014-06-20T16:51:35+10:00] DEBUG: Creating manifest directory at /var/svc/manifest/application
[2014-06-20T16:51:35+10:00] DEBUG: Writing SMF manifest for splunk
[2014-06-20T16:51:35+10:00] DEBUG: importing SMF manifest /var/svc/manifest/application/splunk.xml
1
[2014-06-20T16:51:36+10:00] DEBUG: Saving checksum for SMF splunk: c2c2d8e7a3bc0d93c326adf72a47a00c
[2014-06-20T16:51:36+10:00] INFO: smf[splunk] sending enable action to service[splunk] (immediate)
Recipe: bild_monitoring::default
* service[splunk] action enable [2014-06-20T16:51:36+10:00] INFO: Processing service[splunk] action enable (bild_monitoring::default line 29)
fmri svc:/application/management/splunk:default
name splunk
enabled false
state disabled
next_state none
state_time Fri Jun 20 16:51:37 2014
restarter svc:/system/svc/restarter:default
manifest /var/svc/manifest/application/splunk.xml
dependency require_all/none svc:/milestone/sysconfig (absent)
dependency require_all/none svc:/system/filesystem/local (online)
dependency require_all/none svc:/milestone/name-services (online)
dependency require_all/none svc:/milestone/network (online)
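The service listing above has the shape of `svcs -l` output, and a check for dependencies marked "(absent)" could be automated in a test. The sketch below is only an illustration: `absent_dependencies` is a hypothetical helper, and it assumes each dependency line has the four whitespace-separated fields shown above.

```ruby
# Hypothetical helper: returns the FMRIs of dependencies that the given
# `svcs -l`-style text reports as absent.
def absent_dependencies(svcs_output)
  svcs_output.each_line
             .map(&:split)
             .select { |fields| fields.first == 'dependency' && fields.last == '(absent)' }
             .map { |fields| fields[2] } # third field is the dependency FMRI
end

listing = <<~OUTPUT
  fmri svc:/application/management/splunk:default
  state disabled
  dependency require_all/none svc:/milestone/sysconfig (absent)
  dependency require_all/none svc:/system/filesystem/local (online)
  dependency require_all/none svc:/milestone/name-services (online)
  dependency require_all/none svc:/milestone/network (online)
OUTPUT

p absent_dependencies(listing) # ["svc:/milestone/sysconfig"]
```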
Ok, I'm going to do what I should have done from the start and try to get
test-kitchen set up with a test suite.
I see that there's a solaris 11 vagrant box available here:
https://vagrantcloud.com/ruby-concurrency/oracle-solaris-11. Hopefully
between it and the available OmniOS boxes I can get this sorted out.
@cruschke it looks like this may be an issue with ohai. Look at this commit, which appears to fix the detection of Oracle Solaris: chef/ohai@030077c. It looks like the current release of ohai is too strict in its pattern matching. I see in /etc/release on Solaris 11 that the text is "Oracle Solaris 11.1 X86", which is not going to be matched by the regex in ohai 7.0.4 or earlier.
$ ohai --version
Ohai: 7.0.4
$ ohai | grep platform
"platform_version": "5.11",
"platform_build": "11.1",
"platform_family": null,
$ sudo /opt/chef/embedded/bin/gem install ohai --pre
....
$ ohai --version
Ohai: 7.2.0.alpha.0
$ ohai | grep platform
"platform_version": "5.11",
"platform_build": "11.1",
"platform": "solaris2",
"platform_family": "solaris2",
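To show how an overly strict pattern can miss the "Oracle Solaris 11.1 X86" release string, here is a small sketch. The two regular expressions are illustrative assumptions only; neither is copied from Ohai, and the names `STRICT` and `RELAXED` are made up for this example.

```ruby
# Illustrative patterns only; neither is taken from Ohai's source.
STRICT  = /^\s*Oracle Solaris (\d+)\s/          # whitespace must follow the major version
RELAXED = /^\s*Oracle Solaris (\d+)(\.\d+)?\s/  # also accepts a minor version such as "11.1"

release = 'Oracle Solaris 11.1 X86'

p STRICT.match(release)        # nil: the ".1" breaks the strict pattern
puts RELAXED.match(release)[1] # 11
```

When no pattern matches the release file, a detector written this way has nothing to derive the platform from, which would be consistent with the missing "platform" line in the 7.0.4 output above.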
I definitely want to match on platform == "solaris2", so I can specifically target Oracle Solaris.
...and it was a stupid typo I made 😄. I just pushed a new version (2.0.6) of the cookbook to the community site with the correct platform matching.
I've known I had to set up test-kitchen for a while, as well as write up some chefspecs. This is a good incentive!
As a note, I had to use the latest alpha version of Chef and Ohai in order for this to work. I'm not sure if it's dependent on the specific release of Solaris 11, or maybe older versions of Chef work, as it appears from above that the versions you're using correctly match solaris2.
Thanks, now it's working as expected on both Solaris 10 and 11 👍.